Installation of packages.
install.packages("tidyverse")
install.packages("arules")
install.packages("data.table")
install.packages("summarytools")
install.packages("ggpubr")
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
# Install all Bioconductor packages in a single call
BiocManager::install(c("limma", "Glimma", "edgeR", "org.Hs.eg.db", "GO.db"))
install.packages("car")
install.packages("caret")
install.packages("RColorBrewer")
install.packages("gplots")
install.packages("FactoMineR")
install.packages("factoextra")
install.packages("gprofiler2") # loaded below, so it must be installed as well
Import of packages.
library(readr)
library(readxl)
library(data.table)
library(tidyverse)
library(arules)
library(dplyr)
library(summarytools)
library(ggpubr)
library(limma)
library(Glimma)
library(edgeR)
library(org.Hs.eg.db)
library(GO.db)
library(car)
library(caret)
library(RColorBrewer)
library(gplots)
library(FactoMineR)
library(factoextra)
library(gprofiler2) # gprofiler2 is used for annotation
# raw counts dataset
raw_counts <- read.csv("CCLE_RNAseq_reads.csv")
# mutations dataset
CCLE_mutations <- read.csv("CCLE_mutations.csv")
# drug response dataset
drug_response <- read.csv("primary-screen-replicate-collapsed-logfold-change.csv")
# sample information dataset
sample_info <- read_excel("sample_info.xlsx")
# drug information dataset
drug_info <- read.csv("primary-screen-replicate-collapsed-treatment-info.csv")
As part of the coursework for the subject “Knowledge Extraction from Biological Databases”, we extracted four datasets from the DepMap portal (https://depmap.org/portal/). The portal makes data on a large number of cancer cell lines openly available. It includes the Achilles project (gene essentiality across many cell lines) and the CCLE project (Cancer Cell Line Encyclopedia), which profiles the genetic and pharmacological characteristics of many cancer models.
We were asked to use an RNA-seq expression dataset for protein-coding genes (raw_counts) as our primary dataset. Specifically, it contains raw counts from the Achilles project and the CCLE characterization data. Our secondary dataset (CCLE_mutations) is based on the Mutation Annotation Format (MAF) and lists mutations derived from VCF files produced at project level. As metadata we used the drug_response dataset, which measures the viability of cell lines exposed to non-oncology compounds (as log fold changes). Finally, sample_info contains information about the cell lines referenced in the other datasets.
## [1] 1379 52440
## 'data.frame': 1379 obs. of 52440 variables:
## $ X : chr "ACH-001113" "ACH-001289" "ACH-001339" "ACH-001538" ...
## $ TSPAN6..ENSG00000000003. : num 2383 2529 1552 5657 22806 ...
## $ TNMD..ENSG00000000005. : num 0 13 0 0 0 5 0 0 0 0 ...
## $ DPM1..ENSG00000000419. : num 4766 3857 8057 6056 4702 ...
## $ SCYL3..ENSG00000000457. : num 962 555 849 1057 1109 ...
## $ C1orf112..ENSG00000000460. : num 1518 833 2512 656 1500 ...
## $ FGR..ENSG00000000938. : num 2 0 2 0 98 0 1 2 1 3 ...
## $ CFH..ENSG00000000971. : num 162.8 19.1 201.8 7713 31296 ...
## $ FUCA2..ENSG00000001036. : num 565 1000 13834 7928 5914 ...
## $ GCLC..ENSG00000001084. : num 9078 1758 2189 3473 22393 ...
## $ NFYA..ENSG00000001167. : num 4280 1684 2892 3378 4773 ...
## $ STPG1..ENSG00000001460. : num 902 192 853 555 349 ...
## $ NIPAL3..ENSG00000001461. : num 3364 1555 2923 4916 1127 ...
## $ LAS1L..ENSG00000001497. : num 2276 2206 2436 4630 4639 ...
## $ ENPP4..ENSG00000001561. : num 702 1841 210 2562 245 ...
## $ SEMA3F..ENSG00000001617. : num 985 127 42 1173 10801 ...
## $ CFTR..ENSG00000001626. : num 0 2 7 4 6 ...
## $ ANKIB1..ENSG00000001629. : num 7618 7225 5936 4261 6102 ...
## $ CYP51A1..ENSG00000001630. : num 10875 13374 8497 10992 19852 ...
## $ KRIT1..ENSG00000001631. : num 4151 2423 3843 1604 3140 ...
## $ RAD52..ENSG00000002016. : num 723 348 480 378 685 ...
## $ MYH16..ENSG00000002079. : num 5 0 1 177 8 ...
## $ BAD..ENSG00000002330. : num 812 996 792 1631 2862 ...
## $ LAP3..ENSG00000002549. : num 1775 4528 7263 3411 5468 ...
## $ CD99..ENSG00000002586. : num 1921 3205 16483 5631 8217 ...
## $ HS3ST1..ENSG00000002587. : num 23 17 156 2364 2924 ...
## $ AOC1..ENSG00000002726. : num 2 1 9 5 1 ...
## $ WNT16..ENSG00000002745. : num 4 21 11 10 1 7 9 109 9 27 ...
## $ HECW1..ENSG00000002746. : num 235 40 3 10 1 1 11 5 3 1 ...
## $ MAD1L1..ENSG00000002822. : num 523 428 3075 1179 2498 ...
## $ LASP1..ENSG00000002834. : num 6268 4195 18445 22524 32144 ...
## $ SNX11..ENSG00000002919. : num 1489 1412 2563 1214 1823 ...
## $ TMEM176A..ENSG00000002933. : num 0 1 6 2 0 ...
## $ M6PR..ENSG00000003056. : num 14879 10859 21628 13951 18353 ...
## $ KLHL13..ENSG00000003096. : num 4217 610 18 1 625 ...
## $ CYP26B1..ENSG00000003137. : num 987 14 67 242 115 ...
## $ ICA1..ENSG00000003147. : num 98 569 300 1078 3162 ...
## $ DBNDD1..ENSG00000003249. : num 225 1041 1032 839 564 ...
## $ ALS2..ENSG00000003393. : num 4261 2512 3562 1331 2762 ...
## $ CASP10..ENSG00000003400. : num 362 21 214 1092 802 ...
## $ CFLAR..ENSG00000003402. : num 3717 2329 3133 2825 4989 ...
## $ TFPI..ENSG00000003436. : num 5 4 1524 615 85 ...
## $ NDUFAF7..ENSG00000003509. : num 1518 844 1307 996 946 ...
## $ RBM5..ENSG00000003756. : num 3522 6354 5124 5651 13458 ...
## $ MTMR7..ENSG00000003987. : num 67.5 998.1 99.2 16.8 281.1 ...
## $ SLC7A2..ENSG00000003989. : num 627 948 778 463 8211 ...
## $ ARF5..ENSG00000004059. : num 3937 5066 4865 5098 8778 ...
## $ SARM1..ENSG00000004139. : num 910 560 794 490 1645 ...
## $ POLDIP2..ENSG00000004142. : num 4473 5112 8044 4381 8407 ...
## $ PLXND1..ENSG00000004399. : num 3121 716 16777 118 727 ...
## $ AK2..ENSG00000004455. : num 5782 5588 21212 17554 14712 ...
## $ CD38..ENSG00000004468. : num 17 5 66 15 21 61 29 12 4 27 ...
## $ FKBP4..ENSG00000004478. : num 13558 9604 8382 16632 15173 ...
## $ KDM1A..ENSG00000004487. : num 4516 6694 6924 8978 7596 ...
## $ RBM6..ENSG00000004534. : num 4403 4151 3612 4622 11895 ...
## $ CAMKK1..ENSG00000004660. : num 1975 120 389 371 1307 ...
## $ RECQL..ENSG00000004700. : num 6428 3426 11936 5962 4192 ...
## $ VPS50..ENSG00000004766. : num 1960 1726 2663 1238 1398 ...
## $ HSPB6..ENSG00000004776. : num 13 6.35 18.66 20.62 2 ...
## $ ARHGAP33..ENSG00000004777. : num 456 587 161 91 631 ...
## $ NDUFAB1..ENSG00000004779. : num 1726 2417 3305 5642 4286 ...
## $ PDK4..ENSG00000004799. : num 0 0 347 837 224 ...
## $ SLC22A16..ENSG00000004809. : num 1 4 3 1 0 0 0 1 1 1 ...
## $ ZMYND10..ENSG00000004838. : num 23 79 24 6 15 35 9 120 268 23 ...
## $ ABCB5..ENSG00000004846. : num 51 0 136 0 3 6 5 7 1 2 ...
## $ ARX..ENSG00000004848. : num 1 33 8 0 6 3 877 1 1 8 ...
## $ SLC25A13..ENSG00000004864. : num 758 2449 6220 2542 5473 ...
## $ ST7..ENSG00000004866. : num 791 1904 2463 1532 1639 ...
## $ CDC27..ENSG00000004897. : num 12186 6558 13839 8543 5757 ...
## $ SLC4A1..ENSG00000004939. : num 7 0 0 2 1 0 0 0 0 10 ...
## $ CALCR..ENSG00000004948. : num 6 27 2 3 5 2 10 0 0 11 ...
## $ HCCS..ENSG00000004961. : num 1049 1480 2550 2870 2552 ...
## $ DVL2..ENSG00000004975. : num 1615 3214 4376 1723 3177 ...
## $ PRSS22..ENSG00000005001. : num 13 0 6 2506 1169 ...
## $ UPF1..ENSG00000005007. : num 4928 3025 3969 4581 12645 ...
## $ SKAP2..ENSG00000005020. : num 1406 91 11509 12761 4104 ...
## $ SLC25A5..ENSG00000005022. : num 24646 15202 42244 43089 49243 ...
## $ MCUB..ENSG00000005059. : num 190 5360 1038 1176 1533 ...
## $ HOXA11..ENSG00000005073. : num 16.4 2 21 2 1013.4 ...
## $ POLR2J..ENSG00000005075. : num 2455 2809 4882 1685 4629 ...
## $ DHX33..ENSG00000005100. : num 6164 2193 5646 2293 4776 ...
## $ MEOX1..ENSG00000005102. : num 1 0 0 0 0 4 1 0 2 1 ...
## $ THSD7A..ENSG00000005108. : num 5 50 16 3 50 ...
## $ LIG3..ENSG00000005156. : num 3678 6501 3007 1473 3835 ...
## $ RPAP3..ENSG00000005175. : num 4348 2819 4411 2815 3401 ...
## $ ACSM3..ENSG00000005187. : num 35.6 0.6 16.8 23.9 179.7 ...
## $ REXO5..ENSG00000005189. : num 321 1564 862 1293 586 ...
## $ CIAPIN1..ENSG00000005194. : num 3276 2750 5890 3176 4350 ...
## $ SPPL2B..ENSG00000005206. : num 1333 539 903 783 3976 ...
## $ FAM214B..ENSG00000005238. : num 1219 1211 1786 945 174 ...
## $ COPZ2..ENSG00000005243. : num 58 0 2199 387 4 ...
## $ PRKAR2B..ENSG00000005249. : num 578 2134 3902 102 110 ...
## $ MSL3..ENSG00000005302. : num 1107 560 1492 2569 2880 ...
## $ CREBBP..ENSG00000005339. : num 5318 3030 3435 3295 5221 ...
## $ TSPOAP1..ENSG00000005379. : num 1 0 4 12 72 280 30 578 912 7 ...
## $ MPO..ENSG00000005381. : num 0 0 0 0 3 29 4 57 88 9 ...
## $ PON1..ENSG00000005421. : num 0 1 4 0 2 0 19 2 0 1 ...
## $ GCFC2..ENSG00000005436. : num 961 1459 1897 1949 1713 ...
## $ WDR54..ENSG00000005448. : num 445 614 1416 2798 1224 ...
## [list output truncated]
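The raw_counts dataset has 1379 rows (cell lines) and 52440 columns: one identifier column (“X”) and 52439 gene columns. Cell lines are identified by their DepMap ID and genes by their Hugo symbol followed by the Ensembl ID in parentheses (read.csv converts the spaces and parentheses in the column names to dots, as the output above shows). All variables are numeric except “X”, which contains the cell line labels.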
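Because each gene column name carries both identifiers, it can help to split them into a lookup table before annotation. The following is a minimal sketch on three column names copied from the output above; the object name `gene_map` and the regular expressions are illustrative assumptions, not part of the original analysis.

```r
# Three column names as they appear after read.csv() mangles the header
cols <- c("TSPAN6..ENSG00000000003.",
          "TNMD..ENSG00000000005.",
          "C1orf112..ENSG00000000460.")

# gene_map is a hypothetical helper table: one row per gene column
gene_map <- data.frame(
  column      = cols,
  # everything before the first ".." is the Hugo symbol
  hugo_symbol = sub("\\.\\..*$", "", cols),
  # the Ensembl gene ID sits between ".." and the trailing "."
  ensembl_id  = sub("^.*\\.\\.(ENSG[0-9]+)\\.$", "\\1", cols),
  stringsAsFactors = FALSE
)
print(gene_map)
```

Symbols that originally contained a hyphen or another special character will still show a dot after this split, so reading the file with `check.names = FALSE` may be the safer route when exact symbols matter.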
## [1] 1288288 32
## 'data.frame': 1288288 obs. of 32 variables:
## $ Hugo_Symbol : chr "VPS13D" "AADACL4" "IFNLR1" "TMEM57" ...
## $ Entrez_Gene_Id : int 55187 343066 163702 55219 7579 5453 23139 115361 10451 100288142 ...
## $ NCBI_Build : int 37 37 37 37 37 37 37 37 37 37 ...
## $ Chromosome : chr "1" "1" "1" "1" ...
## $ Start_position : int 12359347 12726308 24484172 25785018 33954141 38512139 46498028 89657103 108247170 148346689 ...
## $ End_position : int 12359347 12726322 24484172 25785019 33954141 38512139 46498028 89657103 108247170 148346689 ...
## $ Strand : chr "+" "+" "+" "+" ...
## $ Variant_Classification: chr "Nonsense_Mutation" "In_Frame_Del" "Silent" "Frame_Shift_Ins" ...
## $ Variant_Type : chr "SNP" "DEL" "SNP" "INS" ...
## $ Reference_Allele : chr "C" "CTGGCGTGACGCCAT" "G" "-" ...
## $ Tumor_Seq_Allele1 : chr "A" "-" "A" "A" ...
## $ dbSNP_RS : chr "" "rs58218425|rs139261871|rs369427733|rs560787141" "" "" ...
## $ dbSNP_Val_Status : chr "" "byFrequency" "" "" ...
## $ Genome_Change : chr "g.chr1:12359347C>A" "g.chr1:12726308_12726322delCTGGCGTGACGCCAT" "g.chr1:24484172G>A" "g.chr1:25785018_25785019insA" ...
## $ Annotation_Transcript : chr "ENST00000358136.3" "ENST00000376221.1" "ENST00000327535.1" "ENST00000374343.4" ...
## $ DepMap_ID : chr "ACH-000001" "ACH-000001" "ACH-000001" "ACH-000001" ...
## $ cDNA_Change : chr "c.6122C>A" "c.786_800delCTGGCGTGACGCCAT" "c.1011C>T" "c.789_790insA" ...
## $ Codon_Change : chr "c.(6121-6123)tCa>tAa" "c.(784-801)tcctggcgtgacgccatc>tcc" "c.(1009-1011)ggC>ggT" "c.(790-792)aaafs" ...
## $ Protein_Change : chr "p.S2041*" "p.WRDAI263del" "p.G337G" "p.K264fs" ...
## $ isDeleterious : chr "True" "False" "False" "True" ...
## $ isTCGAhotspot : chr "False" "False" "False" "False" ...
## $ TCGAhsCnt : num NA NA NA 0 NA 0 0 NA NA 0 ...
## $ isCOSMIChotspot : chr "False" "False" "False" "False" ...
## $ COSMIChsCnt : num 0 3 0 0 0 0 0 0 0 0 ...
## $ ExAC_AF : num NA NA NA NA NA ...
## $ Variant_annotation : chr "damaging" "other non-conserving" "silent" "damaging" ...
## $ CGA_WES_AC : chr "34:213" "57:141" "118:0" "" ...
## $ HC_AC : chr "" "" "" "" ...
## $ RD_AC : chr "" "" "" "" ...
## $ RNAseq_AC : chr "" "" "" "6:28" ...
## $ SangerWES_AC : chr "" "" "" "" ...
## $ WGS_AC : chr "" "" "" "" ...
The CCLE_mutations dataset has 1,288,288 rows (mutations) and 32 descriptive columns. It contains identifiers (e.g., Entrez_Gene_Id, Hugo_Symbol, DepMap_ID) and mutation details (type, position, etc.). The first 11 columns represent part of the MAF format, while the remaining were added by CCLE (e.g., cDNA_Change, isDeleterious, Variant_annotation). DepMap_ID links the mutations to cell lines allowing joins with the other datasets.
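As a small illustration of how DepMap_ID ties mutations to cell lines, the sketch below counts total and deleterious mutations per cell line. It runs on a toy data frame with made-up rows (`toy_mut` and `per_line` are hypothetical names); only the column names are taken from the dataset.

```r
# Toy stand-in for CCLE_mutations (rows are illustrative, not real data)
toy_mut <- data.frame(
  DepMap_ID     = c("ACH-000001", "ACH-000001", "ACH-000002",
                    "ACH-000002", "ACH-000002"),
  Hugo_Symbol   = c("VPS13D", "IFNLR1", "TP53", "KRAS", "TP53"),
  isDeleterious = c("True", "False", "True", "False", "True"),
  stringsAsFactors = FALSE
)

# The flag is stored as text ("True"/"False"), so convert it to logical first
toy_mut$isDeleterious <- toy_mut$isDeleterious == "True"

# Count all mutations and deleterious mutations per cell line
n_total <- table(toy_mut$DepMap_ID)
n_del   <- tapply(toy_mut$isDeleterious, toy_mut$DepMap_ID, sum)

per_line <- data.frame(
  DepMap_ID     = names(n_total),
  n_mutations   = as.integer(n_total),
  n_deleterious = as.integer(n_del[names(n_total)]),
  stringsAsFactors = FALSE
)
print(per_line)
```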
## [1] 578 4687
## 'data.frame': 578 obs. of 4687 variables:
## $ X : chr "ACH-000001" "ACH-000007" "ACH-000008" "ACH-000010_FAILED_STR" ...
## $ BRD.A00077618.236.07.6..2.5..HTS : num -0.0156 -0.0957 0.3795 0.1189 0.1453 ...
## $ BRD.A00100033.001.08.9..2.5..HTS : num -0.449 0.258 -0.596 -0.232 -0.499 ...
## $ BRD.A00147595.001.01.5..2.5..HTS : num 0.489 0.772 0.548 0.622 0.267 ...
## $ BRD.A00218260.001.03.4..2.5..HTS : num 0.207 -0.439 0.422 -0.203 0.158 ...
## $ BRD.A00376169.001.01.6..2.5..HTS : num 0.273 -0.733 -0.217 -1.005 -0.272 ...
## $ BRD.A00520476.001.07.4..2.5..HTS : num 0.021 0.7792 0.0819 -0.2137 0.2078 ...
## $ BRD.A00546892.001.02.6..2.5..HTS : num -0.02546 0.42652 0.14534 0.02025 0.00446 ...
## $ BRD.A00578795.001.04.3..2.5..HTS : num 0.467 -1.289 -0.571 -0.795 -0.192 ...
## $ BRD.A00758722.001.04.9..2.5..HTS : num -0.736 -0.476 -0.512 NA -0.31 ...
## $ BRD.A00827783.001.24.6..2.5..HTS : num 0.644 -0.277 0.453 0.68 0.113 ...
## $ BRD.A00993607.003.24.6..2.5..HTS : num -0.3711 -1.2491 0.2528 -0.0387 0.2683 ...
## $ BRD.A01098288.001.02.9..2.5..HTS : num 0.425 -0.525 0.174 -0.336 0.185 ...
## $ BRD.A01412266.001.01.0..2.5..HTS : num NA 0.40064 0.00387 0.13676 -0.69812 ...
## $ BRD.A01493904.003.12.1..2.5..HTS : num 0.287 -0.7326 -0.0115 -0.7053 -0.0609 ...
## $ BRD.A01563671.001.02.7..2.5..HTS : num 0.4105 0.0659 0.4705 -0.0295 0.4257 ...
## $ BRD.A01593789.001.03.1..2.5..HTS : num 0.405 -1.1162 -0.0331 -0.1378 0.3738 ...
## $ BRD.A01636364.003.15.1..2.5..HTS : num 0.7023 0.527 0.0066 -0.1854 0.4973 ...
## $ BRD.A01643550.001.04.9..2.5..HTS : num -0.6487 -1.6317 -0.0465 NA -0.3948 ...
## $ BRD.A01787639.003.16.9..2.5..HTS : num 0.53716 -0.57987 -0.27487 0.40747 -0.00193 ...
## $ BRD.A01907367.001.01.7..2.5..HTS : num -0.0519 0.556 -0.4662 -0.1097 -0.3315 ...
## $ BRD.A02006392.001.16.4..2.5..HTS : num -0.463 0.423 -0.446 -0.896 -0.39 ...
## $ BRD.A02180903.001.04.5..2.5..HTS : num 0.213 0.177 0.43 -0.397 0.408 ...
## $ BRD.A02710418.003.11.8..2.5..HTS : num 0.514 0.192 -0.354 -0.117 -0.441 ...
## $ BRD.A02743701.001.03.6..2.5..HTS : num 0.5616 -0.0157 0.1392 0.173 -0.1205 ...
## $ BRD.A02759312.003.24.5..2.5..HTS : num 0.0924 0.125 0.2828 -0.2759 0.4065 ...
## $ BRD.A02990301.003.06.6..2.5..HTS : num -0.841 -0.365 -0.393 -0.659 -0.114 ...
## $ BRD.A03061970.003.07.3..2.5..HTS : num 0.61 0.24 0.149 -0.458 -0.124 ...
## $ BRD.A03216249.003.24.3..2.5..HTS : num -1.401 -0.292 0.158 -0.544 -1.085 ...
## $ BRD.A03359064.001.02.3..2.5..HTS : num 0.774 -0.131 0.665 0.188 0.338 ...
## $ BRD.A03506276.001.01.5..2.5..HTS : num -1.41 -2.57 -3.69 -3.3 -5.07 ...
## $ BRD.A03623303.045.09.5..2.5..HTS : num -0.139 -0.366 -0.743 -0.597 -0.831 ...
## $ BRD.A03880619.001.01.0..2.5..HTS : num 0.266 -0.213 NA NA 0.195 ...
## $ BRD.A03932035.004.04.3..2.5..HTS : num 0.648 0.54 -0.147 NA -0.41 ...
## $ BRD.A04203391.001.08.3..2.5..HTS : num -0.383 0.541 -0.297 -0.467 -0.268 ...
## $ BRD.A04252265.003.01.4..2.5..HTS : num NA 0.5777 -0.0203 -0.5163 -0.0502 ...
## $ BRD.A04322457.003.17.9..2.5..HTS : num 0.0587 0.5525 -0.0493 0.6027 0.1814 ...
## $ BRD.A04327189.001.15.1..2.5..HTS : num -0.329 0.066 -0.33 -0.235 -0.315 ...
## $ BRD.A04327189.003.03.3..2.5..HTS : num 0.228 0.543 0.194 NA 0.532 ...
## $ BRD.A04352665.001.05.3..2.5..HTS : num 0.213 0.283 0.528 NA 0.521 ...
## $ BRD.A04497688.001.04.2..2.5..HTS : num 0.179 0.48 0.123 0.611 0.354 ...
## $ BRD.A04506385.001.01.2..2.5..HTS : num NA -0.3566 -0.2249 -0.1493 -0.0816 ...
## $ BRD.A04553218.050.16.2..2.5..HTS : num 0.3445 -0.0371 -0.0985 0.06 0.1954 ...
## $ BRD.A04661934.001.02.9..2.5..HTS : num 0.255 0.259 -0.176 0.135 0.109 ...
## $ BRD.A05186015.003.19.8..2.5..HTS : num 0.0909 0.5407 0.2209 0.1777 0.3522 ...
## $ BRD.A05334458.001.02.8..2.5..HTS : num 0.223 0.992 0.275 0.559 0.169 ...
## $ BRD.A05457250.001.08.4..2.5..HTS : num -0.5334 -0.1141 -0.3143 -1.3938 0.0947 ...
## $ BRD.A05515753.001.03.2..2.5..HTS : num 0.0463 0.1645 0.0135 -0.2616 0.1079 ...
## $ BRD.A05523972.001.01.5..2.5..HTS : num -0.0745 0.4815 0.3689 0.5272 -0.2871 ...
## $ BRD.A05674712.001.06.0..2.5..HTS : num -0.2568 -0.7986 0.0222 NA 0.2414 ...
## $ BRD.A05729358.001.02.0..2.5..HTS : num 0.2683 0.4561 0.0127 0.2433 0.1397 ...
## $ BRD.A05906449.004.01.1..2.5..HTS : num -0.209 -0.397 0.302 0.114 0.213 ...
## $ BRD.A06352418.001.25.9..2.5..HTS : num -0.302 -1.052 -1.207 NA 0.193 ...
## $ BRD.A06352508.001.03.7..2.5..HTS : num -0.191 -2.594 -0.925 -0.124 -0.161 ...
## $ BRD.A06390036.001.04.6..2.5..HTS : num -0.1465 0.3022 -0.1323 -0.1116 0.0304 ...
## $ BRD.A06426627.001.03.5..2.5..HTS : num 0.616 -1.634 -0.289 -0.25 -0.642 ...
## $ BRD.A06627858.236.03.0..2.5..HTS : num -1.54 -3.06 -3.52 -5.01 -2.15 ...
## $ BRD.A06771424.004.01.7..2.5..HTS : num 1.0642 0.5683 0.3165 -0.5771 0.0337 ...
## $ BRD.A06916187.001.02.9..2.5..HTS : num -0.122 0.196 0.524 -0.195 0.176 ...
## $ BRD.A06935312.001.04.3..2.5..HTS : num 0.5016 0.905 0.2082 -0.1517 0.0261 ...
## $ BRD.A07000685.001.04.4..2.5..HTS : num 0.063 0.565 0.548 0.577 0.741 ...
## $ BRD.A07164067.001.01.2..2.5..HTS : num -0.0156 -0.3998 -0.3246 0.1358 0.1764 ...
## $ BRD.A07207424.001.13.2..2.5..HTS : num 0.56 0.477 -0.512 -0.462 -0.522 ...
## $ BRD.A07232941.001.02.7..2.5..HTS : num -0.2076 0.6881 0.0506 0.02 0.2224 ...
## $ BRD.A07395371.003.10.0..2.5..HTS : num -0.2712 0.3746 0.0555 0.2218 0.559 ...
## $ BRD.A07440155.003.25.4..2.5..HTS : num -0.218 0.474 0.204 0.568 0.397 ...
## $ BRD.A07563059.035.01.3..2.5..HTS : num -0.188 0.979 0.268 0.333 0.337 ...
## $ BRD.A07600638.060.05.2..2.5..HTS : num 0.0809 -0.3191 0.1392 NA 0.1176 ...
## $ BRD.A07704283.001.01.3..2.5..HTS : num 0.577 0.716 0.142 -0.289 0.106 ...
## $ BRD.A07780951.001.08.9..2.5..HTS : num -0.0392 -0.0235 0.3309 0.5397 0.0186 ...
## $ BRD.A07780951.065.10.0..2.58..HTS : num 0.872 -0.329 NA NA 0.479 ...
## $ BRD.A07815743.001.01.1..2.5..HTS : num 0.6981 -0.0198 0.2876 -0.6879 0.1377 ...
## $ BRD.A07870296.001.07.4..2.5..HTS : num 0.592 -0.0405 0.0565 0.5407 0.0993 ...
## $ BRD.A07893380.213.01.7..2.5..HTS : num 0.345 -0.407 0.24 NA 0.137 ...
## $ BRD.A07932845.050.15.8..2.5..HTS : num 0.0749 0.5231 -0.1526 -0.4104 0.2943 ...
## $ BRD.A07947329.001.01.4..2.5..HTS : num 0.3441 -0.3001 -0.2426 0.0326 -0.5534 ...
## $ BRD.A07986123.001.02.8..2.5..HTS : num -0.302 -0.01 -0.189 NA -0.301 ...
## $ BRD.A08079565.001.01.8..2.5..HTS : num -0.1685 -0.3897 -0.0983 -0.5774 -0.1868 ...
## $ BRD.A08187463.001.12.9..2.5..HTS : num -0.684 -1.348 -0.343 NA -0.13 ...
## $ BRD.A08255417.001.14.7..2.5..HTS : num -0.135 0.751 0.494 0.038 0.312 ...
## $ BRD.A08302182.001.01.7..2.5..HTS : num 0.3335 -0.6729 -0.0754 -0.6886 -0.1436 ...
## $ BRD.A08545410.003.07.8..2.38..HTS : num 0.0693 0.6077 1.1546 NA 0.6525 ...
## $ BRD.A08545410.003.08.6..2.5..HTS : num 1.095 -0.539 -0.167 -0.327 -0.086 ...
## $ BRD.A08545410.311.01.8..2.5..HTS : num -0.0386 0.1407 -0.1815 NA 0.0935 ...
## $ BRD.A08545410.311.03.4..2.5..HTS : num -1.209 -3.258 -0.924 NA -0.909 ...
## $ BRD.A08660406.001.05.9..2.5..HTS : num 0.0498 -1.3761 -0.7559 NA -0.0463 ...
## $ BRD.A08715367.001.01.7..2.87..HTS : num 0.151 0.328 -0.21 -1.394 -0.328 ...
## $ BRD.A08840375.001.01.3..2.5..HTS : num -1.4 -3.34 -1.57 -2.5 -1.11 ...
## $ BRD.A09056319.003.07.7..2.5..HTS : num -0.126 0.488 0.225 0.403 0.191 ...
## $ BRD.A09062839.003.10.0..2.5..HTS : num -0.5967 0.5649 -0.4324 -0.4458 0.0947 ...
## $ BRD.A09094913.065.01.0..2.5..HTS : num -0.4175 -0.2839 -0.0766 -0.6086 -0.4415 ...
## $ BRD.A09349126.001.10.7..2.5..HTS : num 1.407 -0.7497 0.0472 NA 0.1506 ...
## $ BRD.A09370961.001.01.0..2.5..HTS : num 0.66 -0.375 -1.089 NA -0.287 ...
## $ BRD.A09467419.003.22.4..2.5..HTS : num 0.1244 1.3703 0.1209 NA 0.0886 ...
## $ BRD.A09472452.015.23.4..2.5..HTS : num -0.331 0.2393 -0.1565 NA 0.0392 ...
## $ BRD.A09533288.003.31.2..2.5..HTS : num 0.0396 -0.1382 -0.1224 NA 0.3315 ...
## $ BRD.A09554849.236.07.6..2.5..HTS : num 0.7528 0.5457 -0.0935 -0.4412 0.2627 ...
## $ BRD.A09722536.002.18.0..2.5..HTS : num 0.20442 -0.00205 0.13307 0.53681 0.46134 ...
## $ BRD.A09735281.001.02.5..2.5..HTS : num -0.209 -0.616 0.109 -0.222 0.143 ...
## [list output truncated]
The drug_response dataset has 578 rows (cell lines) and 4687 columns: one identifier column (“X”) and 4686 drug columns. It contains normalized log fold-change cell viability values per drug. Cell lines are identified by their DepMap ID and drugs by their Broad ID. All variables are numeric except “X”, which contains the cell line labels.
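The output above shows at least one label with a suffix (“ACH-000010_FAILED_STR”), which will not match the plain DepMap IDs used in the other tables. A quick check along the following lines can list such labels; the pattern `ACH-` plus six digits is an assumption based on the IDs visible in this report, and the interpretation of the suffix is not documented here.

```r
# Labels copied from the first rows of the X column shown above
ids <- c("ACH-000001", "ACH-000007", "ACH-000008", "ACH-000010_FAILED_STR")

# TRUE for labels that look like a plain DepMap ID (assumed format)
is_plain_id <- grepl("^ACH-[0-9]{6}$", ids)

# Labels that would be lost in an exact-match join on DepMap_ID
print(ids[!is_plain_id])
```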
## [1] 1811 26
## tibble [1,811 × 26] (S3: tbl_df/tbl/data.frame)
## $ DepMap_ID : chr [1:1811] "ACH-000001" "ACH-000002" "ACH-000003" "ACH-000004" ...
## $ cell_line_name : chr [1:1811] "NIH:OVCAR-3" "HL-60" "CACO2" "HEL" ...
## $ stripped_cell_line_name : chr [1:1811] "NIHOVCAR3" "HL60" "CACO2" "HEL" ...
## $ CCLE_Name : chr [1:1811] "NIHOVCAR3_OVARY" "HL60_HAEMATOPOIETIC_AND_LYMPHOID_TISSUE" "CACO2_LARGE_INTESTINE" "HEL_HAEMATOPOIETIC_AND_LYMPHOID_TISSUE" ...
## $ Alias : chr [1:1811] "OVCAR3" NA "CACO2, CaCo-2" NA ...
## $ COSMICID : num [1:1811] 905933 905938 NA 907053 NA ...
## $ sex : chr [1:1811] "Female" "Female" "Male" "Male" ...
## $ source : chr [1:1811] "ATCC" "ATCC" "ATCC" "DSMZ" ...
## $ Achilles_n_replicates : num [1:1811] NA NA NA 2 2 NA 2 NA 3 NA ...
## $ cell_line_NNMD : num [1:1811] NA NA NA -3.08 -2.4 ...
## $ culture_type : chr [1:1811] NA NA NA "Suspension" ...
## $ culture_medium : chr [1:1811] NA NA NA "RPMI + 10% FBS" ...
## $ cas9_activity : num [1:1811] NA NA NA 52.4 86.6 NA 47.7 NA 94.2 NA ...
## $ RRID : chr [1:1811] "CVCL_0465" "CVCL_0002" "CVCL_0025" "CVCL_0001" ...
## $ WTSI_Master_Cell_ID : num [1:1811] 2201 55 NA 783 NA ...
## $ sample_collection_site : chr [1:1811] "ascites" "haematopoietic_and_lymphoid_tissue" "Colon" "haematopoietic_and_lymphoid_tissue" ...
## $ primary_or_metastasis : chr [1:1811] "Metastasis" "Primary" NA NA ...
## $ primary_disease : chr [1:1811] "Ovarian Cancer" "Leukemia" "Colon/Colorectal Cancer" "Leukemia" ...
## $ Subtype : chr [1:1811] "Adenocarcinoma, high grade serous" "Acute Myelogenous Leukemia (AML), M3 (Promyelocytic)" "Adenocarcinoma" "Acute Myelogenous Leukemia (AML), M6 (Erythroleukemia)" ...
## $ age : num [1:1811] 60 35 NA 30 30 64 63 56 72 NA ...
## $ Sanger_Model_ID : chr [1:1811] "SIDM00105" "SIDM00829" "SIDM00891" "SIDM00594" ...
## $ depmap_public_comments : logi [1:1811] NA NA NA NA NA NA ...
## $ lineage : chr [1:1811] "ovary" "blood" "colorectal" "blood" ...
## $ lineage_subtype : chr [1:1811] "ovary_adenocarcinoma" "AML" "colorectal_adenocarcinoma" "AML" ...
## $ lineage_sub_subtype : chr [1:1811] "high_grade_serous" "M3" NA "M6" ...
## $ lineage_molecular_subtype: chr [1:1811] NA NA NA NA ...
The sample_info dataset has 1811 rows (cell lines) and 26 descriptive columns, including identifiers (DepMap_ID, COSMICID, CCLE_Name), cell line features (source, culture_type, lineage), and donor information (sex, age). DepMap_ID is the key that relates this dataset to the others.
# process sample_info
# Set as characters
sample_info$COSMICID <- as.character(sample_info$COSMICID)
sample_info$WTSI_Master_Cell_ID <- as.character(sample_info$WTSI_Master_Cell_ID)
# Set as factors
sample_info$sex <- as.factor(sample_info$sex)
sample_info$source <- as.factor(sample_info$source)
sample_info$culture_type <- as.factor(sample_info$culture_type)
sample_info$sample_collection_site <- as.factor(sample_info$sample_collection_site)
sample_info$primary_or_metastasis <- as.factor(sample_info$primary_or_metastasis)
sample_info$primary_disease <- as.factor(sample_info$primary_disease)
sample_info$Subtype <- as.factor(sample_info$Subtype)
sample_info$lineage <- as.factor(sample_info$lineage)
sample_info$lineage_subtype <- as.factor(sample_info$lineage_subtype)
sample_info$lineage_sub_subtype <- as.factor(sample_info$lineage_sub_subtype)
sample_info$lineage_molecular_subtype <- as.factor(sample_info$lineage_molecular_subtype)
# process mutations dataset
# Set as characters
CCLE_mutations$Hugo_Symbol <- as.character(CCLE_mutations$Hugo_Symbol)
CCLE_mutations$Entrez_Gene_Id <- as.character(CCLE_mutations$Entrez_Gene_Id)
CCLE_mutations$NCBI_Build <- as.character(CCLE_mutations$NCBI_Build)
# Set as factors
CCLE_mutations$Strand <- as.factor(CCLE_mutations$Strand)
CCLE_mutations$Variant_Classification <- as.factor(CCLE_mutations$Variant_Classification)
CCLE_mutations$Variant_Type <- as.factor(CCLE_mutations$Variant_Type)
CCLE_mutations$dbSNP_Val_Status <- as.factor(CCLE_mutations$dbSNP_Val_Status)
CCLE_mutations$DepMap_ID <- as.factor(CCLE_mutations$DepMap_ID)
CCLE_mutations$Variant_annotation <- as.factor(CCLE_mutations$Variant_annotation)
After identifying the variable types, we curated them manually to ensure correct formats for statistical analysis. Several character columns were converted to factors (e.g., sex), and some numeric identifiers were converted to character (e.g., COSMICID, Entrez_Gene_Id) so that they are not treated as quantities. Missing values in non-numeric variables were left as-is, to be handled later if needed.
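The same conversions can be written more compactly by listing the target columns once and applying the conversion to all of them. This is only a sketch on a toy data frame (`toy_info`, `id_cols`, and `factor_cols` are hypothetical names); it is meant to be equivalent in effect to the column-by-column assignments, not a replacement the analysis depends on.

```r
# Toy stand-in for sample_info (values are illustrative)
toy_info <- data.frame(
  DepMap_ID = c("ACH-000001", "ACH-000002", "ACH-000003"),
  COSMICID  = c(905933, 905938, NA),
  sex       = c("Female", "Female", "Male"),
  lineage   = c("ovary", "blood", "colorectal"),
  stringsAsFactors = FALSE
)

id_cols     <- c("COSMICID")        # numeric identifiers -> character
factor_cols <- c("sex", "lineage")  # categorical text -> factor

toy_info[id_cols]     <- lapply(toy_info[id_cols], as.character)
toy_info[factor_cols] <- lapply(toy_info[factor_cols], as.factor)

str(toy_info)
```

Keeping the column lists in named vectors makes it easier to see at a glance which variables are treated as identifiers and which as categories.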
## [1] 0
## [1] 1955870
## [1] 76337
## [1] 14818
We found NAs in sample_info, drug_response and CCLE_mutations; raw_counts has none. Only the NAs in drug_response (numeric) need immediate treatment, because many downstream methods cannot handle missing values. For drug_response we replaced each NA with the mean of its column, reasoning that the mean response to the same drug is a more appropriate substitute than a mean taken across different drugs for the same cell line.
# Replace NAs in drug_response with the mean of the corresponding column (drug)
for (i in 2:ncol(drug_response)) {
  m <- mean(drug_response[, i], na.rm = TRUE)
  drug_response[is.na(drug_response[, i]), i] <- m
}
# Count the remaining NAs
sum(is.na(drug_response))
## [1] 0
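For reference, the same column-mean imputation can be expressed without an explicit loop. The sketch below uses a toy data frame so it can be run on its own; `toy_resp` and `impute_mean` are hypothetical names introduced for illustration.

```r
# Toy stand-in for drug_response (values are illustrative)
toy_resp <- data.frame(
  DepMap_ID = c("ACH-000001", "ACH-000007", "ACH-000008"),
  drug_a    = c(-0.5, NA, 0.1),
  drug_b    = c(NA, 0.4, 0.2),
  stringsAsFactors = FALSE
)

# Only numeric columns are imputed; the ID column is left untouched
num_cols <- names(toy_resp)[sapply(toy_resp, is.numeric)]

# NAs per drug before imputation
print(colSums(is.na(toy_resp[num_cols])))

# Replace the NAs of a vector with the mean of its observed values
impute_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}
toy_resp[num_cols] <- lapply(toy_resp[num_cols], impute_mean)

print(sum(is.na(toy_resp)))
```

A column that is entirely NA would end up as NaN with this approach, so such columns would need to be dropped or handled separately.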
As previously mentioned, datasets are linked by DepMap IDs. We created subsets to keep only cell lines common across datasets for downstream analyses.
# Rename the identifier column to DepMap_ID
names(drug_response)[names(drug_response) == "X"] <- "DepMap_ID"
names(raw_counts)[names(raw_counts) == "X"] <- "DepMap_ID"
# Create subsets with the cell lines present in all datasets
# Subset raw_counts x CCLE_mutations
sub_raw_counts <- raw_counts %>% filter(DepMap_ID %in% CCLE_mutations$DepMap_ID)
dim(sub_raw_counts)
## [1] 1372 52440
# Subset sub_raw_counts x drug_response
sub_raw_counts <- sub_raw_counts %>% filter(DepMap_ID %in% drug_response$DepMap_ID)
dim(sub_raw_counts)
## [1] 559 52440
# Subset sub_raw_counts x sample_info
sub_raw_counts <- sub_raw_counts %>% filter(DepMap_ID %in% sample_info$DepMap_ID)
dim(sub_raw_counts)
## [1] 559 52440
# Subset CCLE_mutations x sub_raw_counts
sub_CCLE_mutations <- CCLE_mutations %>% filter(DepMap_ID %in% sub_raw_counts$DepMap_ID)
# Subset sample_info x sub_raw_counts
sub_sample_info <- sample_info %>% filter(DepMap_ID %in% sub_raw_counts$DepMap_ID)
# Subset drug_response x sub_raw_counts
sub_drug_response <- drug_response %>% filter(DepMap_ID %in% sub_raw_counts$DepMap_ID)
Intersecting the datasets yielded 559 cell lines common to all of them. drug_response was the most limiting dataset: the number of retained cell lines dropped from 1372 to 559 once it was included.
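The successive filters amount to taking the intersection of the DepMap IDs of all four tables. A minimal sketch of that idea on toy ID vectors is shown here; the vectors and the name `common_ids` are hypothetical, and with the real data the factor column `CCLE_mutations$DepMap_ID` would first be passed through `as.character()`.

```r
# Toy ID vectors standing in for the DepMap_ID column of each dataset
ids_counts <- c("ACH-000001", "ACH-000002", "ACH-000007", "ACH-000008")
ids_mut    <- c("ACH-000001", "ACH-000007", "ACH-000008", "ACH-000009")
ids_drug   <- c("ACH-000001", "ACH-000007", "ACH-000010_FAILED_STR")
ids_info   <- c("ACH-000001", "ACH-000002", "ACH-000007",
                "ACH-000008", "ACH-000009")

# IDs present in every dataset
common_ids <- Reduce(intersect, list(ids_counts, ids_mut, ids_drug, ids_info))
print(common_ids)

# Apply the same ID set to a table (here a toy counts table)
toy_counts <- data.frame(DepMap_ID = ids_counts,
                         geneA = c(10, 20, 30, 40),
                         stringsAsFactors = FALSE)
sub_toy_counts <- toy_counts[toy_counts$DepMap_ID %in% common_ids, ]
print(sub_toy_counts)
```

Computing the common ID set once and reusing it for every table keeps the subsets consistent by construction.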
print(dfSummary(sub_CCLE_mutations, style = 'grid', graph.magnif = 1, valid.col = FALSE,
max.distinct.values = 5, col.widths = c(100, 200, 200, 350, 500, 250),
dfSummary.silent = TRUE, headings = FALSE, justify = 'l')
, method = 'render', max.tbl.height = 500)

| No | Variable | Stats / Values | Freqs (% of Valid) | Missing |
|---|---|---|---|---|
| 1 | Hugo_Symbol [character] | | | 0 (0.0%) |
| 2 | Entrez_Gene_Id [character] | | | 0 (0.0%) |
| 3 | NCBI_Build [character] | 1. 37 | | 0 (0.0%) |
| 4 | Chromosome [character] | | | 0 (0.0%) |
| 5 | Start_position [integer] | | 460934 distinct values | 0 (0.0%) |
| 6 | End_position [integer] | | 459239 distinct values | 0 (0.0%) |
| 7 | Strand [factor] | 1. + | | 0 (0.0%) |
| 8 | Variant_Classification [factor] | | | 0 (0.0%) |
| 9 | Variant_Type [factor] | | | 0 (0.0%) |
| 10 | Reference_Allele [character] | | | 0 (0.0%) |
| 11 | Tumor_Seq_Allele1 [character] | | | 0 (0.0%) |
| 12 | dbSNP_RS [character] | | | 0 (0.0%) |
| 13 | dbSNP_Val_Status [factor] | | | 0 (0.0%) |
| 14 | Genome_Change [character] | | | 0 (0.0%) |
| 15 | Annotation_Transcript [character] | | | 0 (0.0%) |
| 16 | DepMap_ID [factor] | | | 0 (0.0%) |
| 17 | cDNA_Change [character] | | | 0 (0.0%) |
| 18 | Codon_Change [character] | | | 0 (0.0%) |
| 19 | Protein_Change [character] | | | 0 (0.0%) |
| 20 | isDeleterious [character] | | | 0 (0.0%) |
| 21 | isTCGAhotspot [character] | | | 0 (0.0%) |
| 22 | TCGAhsCnt [numeric] | | 78 distinct values | 360019 (72.1%) |
| 23 | isCOSMIChotspot [character] | | | 0 (0.0%) |
| 24 | COSMIChsCnt [numeric] | | 233 distinct values | 25 (0.0%) |
| 25 | ExAC_AF [numeric] | | 3781 distinct values | 390013 (78.1%) |
| 26 | Variant_annotation [factor] | | | 0 (0.0%) |
| 27 | CGA_WES_AC [character] | | | 0 (0.0%) |
| 28 | HC_AC [character] | | | 0 (0.0%) |
| 29 | RD_AC [character] | | | 0 (0.0%) |
| 30 | RNAseq_AC [character] | | | 0 (0.0%) |
| 31 | SangerWES_AC [character] | | | 0 (0.0%) |
| 32 | WGS_AC [character] | | | 0 (0.0%) |

Generated by summarytools 1.1.4 (R version 4.5.1)
2025-08-28
After cleaning the variable types we computed descriptive statistics for sub_sample_info and sub_CCLE_mutations. We omitted summaries for sub_raw_counts and sub_drug_response because of their large number of numeric variables and the computational cost. For sub_CCLE_mutations, chromosome 1 is the most frequent (~10%); Variant_Classification is dominated by Missense_Mutation (~58%) and Silent (~25%); and Variant_Type is mostly SNP (~91%). The dbSNP annotation columns are often empty, although empty strings are not counted as missing, and some of the columns added by CCLE have many distinct values and many missing entries (TCGAhsCnt 72.1%, ExAC_AF 78.1%).
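Percentages like those quoted for Variant_Classification, and the share of missing entries in a column, can be reproduced with base R. The sketch below uses made-up vectors (`variant_class`, `exac_af`) purely to show the calculation; the resulting numbers are not those of the real dataset.

```r
# Toy categorical column (illustrative values only)
variant_class <- c("Missense_Mutation", "Missense_Mutation", "Missense_Mutation",
                   "Silent", "Silent", "Nonsense_Mutation")

# Frequency table sorted from most to least common, then as percentages
freq <- sort(table(variant_class), decreasing = TRUE)
pct  <- round(100 * prop.table(freq), 1)
print(pct)

# Toy numeric column with missing entries
exac_af <- c(NA, 0.01, NA, NA)

# Share of missing values, in percent
pct_missing <- 100 * mean(is.na(exac_af))
print(pct_missing)
```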
print(dfSummary(sub_sample_info, style = 'grid', graph.magnif = 1, valid.col = FALSE,
max.distinct.values = 5, col.widths = c(100, 200, 200, 350, 500, 250),
dfSummary.silent = TRUE, headings = FALSE, justify = 'l')
, method = 'render', max.tbl.height = 500)

| No | Variable | Stats / Values | Freqs (% of Valid) | Missing |
|---|---|---|---|---|
| 1 | DepMap_ID [character] | | | 0 (0.0%) |
| 2 | cell_line_name [character] | | | 0 (0.0%) |
| 3 | stripped_cell_line_name [character] | | | 0 (0.0%) |
| 4 | CCLE_Name [character] | | | 0 (0.0%) |
| 5 | Alias [character] | | | 542 (97.0%) |
| 6 | COSMICID [character] | | | 164 (29.3%) |
| 7 | sex [factor] | | | 0 (0.0%) |
| 8 | source [factor] | | | 0 (0.0%) |
| 9 | Achilles_n_replicates [numeric] | | | 120 (21.5%) |
| 10 | cell_line_NNMD [numeric] | | 439 distinct values | 120 (21.5%) |
| 11 | culture_type [factor] | | | 247 (44.2%) |
| 12 | culture_medium [character] | | | 121 (21.6%) |
| 13 | cas9_activity [numeric] | | 292 distinct values | 123 (22.0%) |
| 14 | RRID [character] | | | 2 (0.4%) |
| 15 | WTSI_Master_Cell_ID [character] | | | 164 (29.3%) |
| 16 | sample_collection_site [factor] | | | 0 (0.0%) |
| 17 | primary_or_metastasis [factor] | | | 66 (11.8%) |
| 18 | primary_disease [factor] | | | 0 (0.0%) |
| 19 | Subtype [factor] | | | 15 (2.7%) |
| 20 | age [numeric] | | 77 distinct values | 142 (25.4%) |
| 21 | Sanger_Model_ID [character] | | | 112 (20.0%) |
| 22 | depmap_public_comments [logical] | | | 559 (100.0%) |
| 23 | lineage [factor] | | | 0 (0.0%) |
| 24 | lineage_subtype [factor] | | | 1 (0.2%) |
| 25 | lineage_sub_subtype [factor] | | | 272 (48.7%) |
| 26 | lineage_molecular_subtype [factor] | | | 500 (89.4%) |

In sub_sample_info, the sex distribution is ~41% Female, ~53% Male and ~5% Unknown. The most frequent source providers are ATCC, DSMZ and HSRRB. The mean number of replicates is ~2.1, and the culture type is mainly Adherent (~90%). cas9_activity has a mean of ~74.9 (SD 15.3). primary_or_metastasis splits into 59% Primary and 41% Metastasis, and the mean age is 53. Several variables have many missing entries, for example Alias (97.0%), lineage_molecular_subtype (89.4%) and lineage_sub_subtype (48.7%).
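The missing-value shares reported in a summary like this can also be recomputed directly in base R. The following is a minimal sketch on a toy data frame; the object `toy` and its values are made up for illustration and are not part of the DepMap data.

```r
# Toy stand-in for sub_sample_info (hypothetical values)
toy <- data.frame(
  sex = c("Female", "Male", NA, "Male"),
  age = c(53, NA, NA, 61),
  lineage = c("lung", "skin", "lung", "blood"),
  stringsAsFactors = TRUE
)

# Share of missing entries per column, in percent
pct_missing <- round(100 * colMeans(is.na(toy)), 1)

# Columns sorted from most to least incomplete
sort(pct_missing, decreasing = TRUE)
```

Sorting the result makes it easy to spot columns that are too incomplete to be used as metadata.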
# Pie Chart Variant type
w = table(sub_CCLE_mutations$Variant_Type)
t = as.data.frame(w)
slices <- t$Freq
lbls <- t$Var1
pct <- round(slices/sum(slices)*100)
lbls <- paste(lbls, pct) # add percents to labels
lbls <- paste(lbls,"%",sep="") # add % to labels
pie(slices,labels = lbls, col = c("darkslategray1", "coral", "darkolivegreen1", "navajowhite"),
main="Variant Type")
w2 = table(sub_CCLE_mutations$Variant_annotation)
t2 = as.data.frame(w2)
# Pie Chart Variant annotation
slices <- t2$Freq
lbls <- t2$Var1
pct <- round(slices/sum(slices)*100)
lbls <- paste(lbls, pct) # add percents to labels
lbls <- paste(lbls,"%",sep="") # add % to labels
pie(slices,labels = lbls, col = c("darkslategray1", "coral", "darkolivegreen1", "navajowhite"),
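main="Variant Annotations")
# Levene's test for homogeneity of variance - age vs sex
leveneTest(sub_sample_info$age ~ sub_sample_info$sex)
## Levene's Test for Homogeneity of Variance (center = median)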
## Df F value Pr(>F)
## group 2 0.8271 0.438
## 414
# There is no evidence to suggest the variance across groups is statistically significantly different
# Compute ANOVA
res.aov <- aov(sub_sample_info$age ~ sub_sample_info$sex)
# Summary of ANOVA
summary(res.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## sub_sample_info$sex 2 542 270.8 0.826 0.438
## Residuals 414 135683 327.7
## 142 observations deleted due to missingness
# No differences
# Boxplot Age vs Sex
boxplot(age ~ sex, data = sub_sample_info, main = "Age vs Sex", xlab = "Sex", ylab = "Age (years)", col = c("darkslategray1", "coral", "darkolivegreen1"))
# Histogram Cas9_Activity
hist(sub_sample_info$cas9_activity, main = "Cas9 Activity", xlab = "Percentage", ylab = "Number of cell lines", col = "navajowhite")
We created pie charts for Variant_Type and Variant_annotation, ran Levene's test and a one-way ANOVA of age by sex (no significant difference, p = 0.438), and drew a boxplot of age by sex and a histogram of cas9_activity as supporting visuals.
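To make the variance check less of a black box: Levene's test with center = median amounts to a one-way ANOVA on the absolute deviations from each group's median. The sketch below reproduces that idea in base R on simulated ages; the data frame `toy` and its values are hypothetical and only illustrate the mechanics.

```r
set.seed(1)
# Simulated ages for three hypothetical sex groups (not the DepMap values)
toy <- data.frame(
  sex = factor(rep(c("Female", "Male", "Unknown"), times = c(40, 50, 10))),
  age = rnorm(100, mean = 53, sd = 18)
)

# Levene's test with center = median:
# one-way ANOVA on absolute deviations from each group's median
toy$abs_dev <- abs(toy$age - ave(toy$age, toy$sex, FUN = median))
levene_tab <- anova(lm(abs_dev ~ sex, data = toy))
levene_p <- levene_tab[["Pr(>F)"]][1]

# One-way ANOVA on the raw ages; data = avoids repeating `toy$`
res_aov <- aov(age ~ sex, data = toy)
aov_p <- summary(res_aov)[[1]][["Pr(>F)"]][1]

c(levene = levene_p, anova = aov_p)
```

Passing `data =` to `aov()` also gives shorter term labels in the summary than the `sub_sample_info$sex` form.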
For the differential expression analysis we selected two metadata variables: type, taken from primary_or_metastasis in sample_info, and mut_counts, a derived variable that bins the number of mutations per cell line in CCLE_mutations.
# 1st metadata --> count mutations per cell line
# group by cell lines
mutation_counts <- CCLE_mutations %>%
group_by(DepMap_ID) %>%
summarise(mut_freq = length(DepMap_ID))
# Descriptive analysis of mut_freq
descr(mutation_counts$mut_freq)
## Descriptive Statistics
## mutation_counts$mut_freq
## N: 1747
##
## mut_freq
## ----------------- ----------
## Mean 737.43
## Std.Dev 2163.49
## Min 16.00
## Q1 249.00
## Median 367.00
## Q3 577.00
## Max 66665.00
## MAD 207.56
## IQR 326.50
## CV 2.93
## Skewness 19.42
## SE.Skewness 0.06
## Kurtosis 525.69
## N.Valid 1747.00
## N 1747.00
## Pct.Valid 100.00
# Discretize into 3 levels (Low, Medium, High) by frequency
mutation_counts$mut_counts <- discretize(mutation_counts$mut_freq, method = "frequency", breaks = 3, labels = c("Low", "Medium", "High"))
# set row names to DepMap_ID
mutation_counts <- mutation_counts %>% remove_rownames %>% column_to_rownames(var="DepMap_ID")
# inspect structure and distribution
str(mutation_counts)
## 'data.frame': 1747 obs. of 2 variables:
## $ mut_freq : int 308 180 207 271 288 220 408 379 438 352 ...
## $ mut_counts: Factor w/ 3 levels "Low","Medium",..: 2 1 1 1 1 1 2 2 2 2 ...
## ..- attr(*, "discretized:breaks")= num [1:4] 16 290 477 66665
## ..- attr(*, "discretized:method")= chr "frequency"
freq(mutation_counts$mut_counts)
## Frequencies
## mutation_counts$mut_counts
## Type: Factor
##
## Freq % Valid % Valid Cum. % Total % Total Cum.
## ------------ ------ --------- -------------- --------- --------------
## Low 582 33.31 33.31 33.31 33.31
## Medium 581 33.26 66.57 33.26 66.57
## High 584 33.43 100.00 33.43 100.00
## <NA> 0 0.00 100.00
## Total 1747 100.00 100.00 100.00 100.00
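The equal-frequency binning can be approximated in base R with `quantile()` and `cut()`. This is a rough sketch on simulated counts, not the arules implementation, which may treat ties and interval closure differently; the vector `mut_freq` below is hypothetical.

```r
set.seed(42)
# Simulated right-skewed mutation counts (hypothetical, not the CCLE values)
mut_freq <- round(rlnorm(300, meanlog = 6, sdlog = 0.8))

# Tertile cut points: minimum, 1/3 and 2/3 quantiles, maximum
breaks <- quantile(mut_freq, probs = c(0, 1/3, 2/3, 1))

# Assign each value to a bin; include.lowest keeps the minimum in "Low"
mut_counts <- cut(mut_freq, breaks = breaks,
                  labels = c("Low", "Medium", "High"),
                  include.lowest = TRUE)

table(mut_counts)
```

Because the cut points are quantiles, the bins hold similar numbers of cell lines even when the distribution is as skewed as mut_freq, where the High bin spans 477 to 66665.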
# 2nd metadata - Primary or Metastasis
# Create dataframe with cell lines and cancer type (primary or metastasis)
prim_met = data.frame(label = sample_info$DepMap_ID, type = sample_info$primary_or_metastasis)
# Set "type" as factor
prim_met$type <- as.factor(prim_met$type)
# Distribution by cancer type
freq(prim_met$type)
## Error in match(x, table, nomatch = 0L): 'match' requires vector arguments
## Warning in parse_call(mc = match.call(), caller = "freq"): metadata extraction
## terminated unexpectedly; inspect results carefully
## Frequencies
##
## Freq % Valid % Valid Cum. % Total % Total Cum.
## ---------------- ------ --------- -------------- --------- --------------
## Metastasis 498 43.42 43.42 27.50 27.50
## Primary 649 56.58 100.00 35.84 63.34
## <NA> 664 36.66 100.00
## Total 1811 100.00 100.00 100.00 100.00
# set row names
prim_met <- prim_met %>% remove_rownames %>% column_to_rownames(var="label")
# Merge mutation counts and cancer type by cell line
metadados <- merge(mutation_counts, prim_met, by=0, all=TRUE)
# Omit rows with NA
metadados_2 <- metadados
metadados_2$type <- NULL
clean_metadados <- na.omit(metadados)
dfSummary(clean_metadados)
## Data Frame Summary
## clean_metadados
## Dimensions: 1125 x 4
## Duplicates: 0
##
## --------------------------------------------------------------------------------------------------------------
## No Variable Stats / Values Freqs (% of Valid) Graph Valid Missing
## ---- ------------- -------------------------- --------------------- --------------------- ---------- ---------
## 1 Row.names 1. ACH-000001 1 ( 0.1%) 1125 0
## [character] 2. ACH-000002 1 ( 0.1%) (100.0%) (0.0%)
## 3. ACH-000006 1 ( 0.1%)
## 4. ACH-000007 1 ( 0.1%)
## 5. ACH-000009 1 ( 0.1%)
## 6. ACH-000011 1 ( 0.1%)
## 7. ACH-000012 1 ( 0.1%)
## 8. ACH-000013 1 ( 0.1%)
## 9. ACH-000014 1 ( 0.1%)
## 10. ACH-000015 1 ( 0.1%)
## [ 1115 others ] 1115 (99.1%) IIIIIIIIIIIIIIIIIII
##
## 2 mut_freq Mean (sd) : 812.5 (2313) 672 distinct values : 1125 0
## [integer] min < med < max: : (100.0%) (0.0%)
## 16 < 421 < 66665 :
## IQR (CV) : 356 (2.8) :
## :
##
## 3 mut_counts 1. Low 270 (24.0%) IIII 1125 0
## [factor] 2. Medium 389 (34.6%) IIIIII (100.0%) (0.0%)
## 3. High 466 (41.4%) IIIIIIII
##
## 4 type 1. Metastasis 483 (42.9%) IIIIIIII 1125 0
## [factor] 2. Primary 642 (57.1%) IIIIIIIIIII (100.0%) (0.0%)
## --------------------------------------------------------------------------------------------------------------
# Datasets for differential expression
meta_expression <- raw_counts %>% filter(DepMap_ID %in% clean_metadados$Row.names)
meta_intersect <- clean_metadados %>% filter(Row.names %in% meta_expression$DepMap_ID)
# Verify intersection
dim(meta_expression)
## [1] 1020 52440
dim(meta_intersect)
## [1] 1020 4
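We counted the mutations per DepMap_ID to obtain mut_freq and discretized it by frequency into mut_counts with three bins, Low, Medium and High, which are close to equal in size on the full set of 1,747 cell lines (582, 581 and 584). type is taken from primary_or_metastasis, and cell lines with NA in any metadata column were omitted, which leaves 1,125 cell lines and makes the bins unequal (270 Low, 389 Medium, 466 High). Intersecting clean_metadados with raw_counts by DepMap_ID gives meta_expression and meta_intersect, the matched expression and metadata tables (1,020 cell lines each) used for the differential analysis.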
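Matching row counts do not by themselves guarantee that the two tables list the cell lines in the same order, which the later model fitting relies on. A minimal sketch of an explicit alignment check, using small made-up tables (`meta` and `expr` are hypothetical stand-ins for clean_metadados and raw_counts):

```r
# Hypothetical metadata keyed by cell-line ID
meta <- data.frame(
  Row.names = c("ACH-000003", "ACH-000001", "ACH-000002"),
  type = c("Primary", "Metastasis", "Primary")
)

# Hypothetical expression table: one row per cell line, genes in columns
expr <- data.frame(
  DepMap_ID = c("ACH-000002", "ACH-000004", "ACH-000001"),
  geneA = c(10, 5, 7),
  geneB = c(0, 3, 2)
)

# Keep only cell lines present in both tables
shared <- sort(intersect(meta$Row.names, expr$DepMap_ID))

# Reorder both tables by the shared IDs so row i refers to the same cell line
meta_aligned <- meta[match(shared, meta$Row.names), ]
expr_aligned <- expr[match(shared, expr$DepMap_ID), ]

# Fail early if the two tables are not in the same order
stopifnot(identical(meta_aligned$Row.names, expr_aligned$DepMap_ID))
```

A check of this kind placed just before the design matrix is built would catch a silent mismatch between samples and group labels.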
We explored the metadata with descriptive stats and non-parametric tests where appropriate.
# Density plot
ggdensity(clean_metadados$mut_freq,
main = "Mutation Counts",
xlab = "Number of Mutations")
# Shapiro-Wilk normality test
shapiro.test(clean_metadados$mut_freq)
##
## Shapiro-Wilk normality test
##
## data: clean_metadados$mut_freq
## W = 0.18291, p-value < 2.2e-16
# Levene's test - freq vs type
leveneTest(clean_metadados$mut_freq ~ clean_metadados$type)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 1 0.1588 0.6903
## 1123
# There is no evidence to suggest variance across groups is significantly different
# Levene's test - freq vs counts
leveneTest(clean_metadados$mut_freq ~ clean_metadados$mut_counts)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 22.523 2.57e-10 ***
## 1122
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# There is evidence to suggest variance across groups is significantly different
## Tests on means
# Wilcoxon test for two groups - freq vs type
wilcox.test(clean_metadados$mut_freq ~ clean_metadados$type,
exact = FALSE)
##
## Wilcoxon rank sum test with continuity correction
##
## data: clean_metadados$mut_freq by clean_metadados$type
## W = 160234, p-value = 0.3359
## alternative hypothesis: true location shift is not equal to 0
# Box plot - types
p<-ggboxplot(clean_metadados, x = "type", y = "mut_freq",
color = "type", palette = c("#00AFBB", "#E7B800"),
ylab = "Number of mutations", xlab = "Cancer type")
ggpar(p, ylim = c(0,1000), main = "Cancer Type")
# Kruskal-Wallis - freq vs counts
res.krusk <- kruskal.test(clean_metadados$mut_freq ~ clean_metadados$mut_counts)
res.krusk
##
## Kruskal-Wallis rank sum test
##
## data: clean_metadados$mut_freq by clean_metadados$mut_counts
## Kruskal-Wallis chi-squared = 982.11, df = 2, p-value < 2.2e-16
##
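# Pairwise Wilcoxon tests with BH correction
pairwise.wilcox.test(clean_metadados$mut_freq, clean_metadados$mut_counts, p.adjust.method = "BH")
## Pairwise comparisons using Wilcoxon rank sum test with continuity correction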
##
## data: clean_metadados$mut_freq and clean_metadados$mut_counts
##
## Low Medium
## Medium <2e-16 -
## High <2e-16 <2e-16
##
## P value adjustment method: BH
# Box plot - counts
p2<-ggboxplot(clean_metadados, x = "mut_counts", y = "mut_freq",
color = "mut_counts", palette = c("#00AFBB", "#E7B800", "#FC4E07"),
ylab = "Number of mutations", xlab = "Mutation count group")
ggpar(p2, ylim = c(0,2000), main = "Mutation Counts")
mut_freq is not normally distributed (Shapiro-Wilk, p < 2.2e-16), so we used non-parametric tests: the Wilcoxon rank-sum test for the two-group comparison by type, and the Kruskal-Wallis test for the three-group comparison by mut_counts, followed by pairwise Wilcoxon tests with BH correction. mut_freq does not differ significantly between Primary and Metastasis (p = 0.3359). It differs significantly across the mut_counts groups (p < 2.2e-16), which is expected because mut_counts is derived from mut_freq.
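The test sequence described above can be sketched end to end with base R's stats functions. The data below are simulated and the object names (`toy`, `kw`, `pw`) are illustrative only.

```r
set.seed(7)
# Simulated skewed counts for three hypothetical groups
toy <- data.frame(
  group = factor(rep(c("Low", "Medium", "High"), each = 30),
                 levels = c("Low", "Medium", "High")),
  value = c(rlnorm(30, 5, 0.3), rlnorm(30, 6, 0.3), rlnorm(30, 7, 0.3))
)

# Normality check on the pooled values
shapiro_p <- shapiro.test(toy$value)$p.value

# Global test across the three groups
kw <- kruskal.test(value ~ group, data = toy)

# Pairwise Wilcoxon rank-sum tests with Benjamini-Hochberg adjustment
pw <- pairwise.wilcox.test(toy$value, toy$group, p.adjust.method = "BH")

kw$p.value
pw$p.value
```

`pw$p.value` is a lower-triangular matrix with one adjusted p-value per pair of groups, the same layout as the pairwise table printed above.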
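To build exp_matrix for the differential expression analysis, we set DepMap_ID as the row names of meta_expression, sorted the rows by cell-line ID and transposed the table so that genes are rows and cell lines are columns.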
# Set rownames and transpose expression data
meta_expression <- meta_expression %>% remove_rownames %>% column_to_rownames(var="DepMap_ID")
# Order cell line ids
meta_expression_ordered <- meta_expression[ order(row.names(meta_expression)), ]
# Transpose expression matrix
exp_matrix <- t(meta_expression_ordered)
We extracted the Ensembl gene IDs from the row names of exp_matrix and used gconvert() from gprofiler2 to map them to the current human annotation. Genes without a mapping were removed.
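The ID extraction step can be illustrated in base R without stringr. The column names below are hypothetical examples in a "SYMBOL (ENSG...)" style as they would look after `read.csv()` has replaced spaces and brackets with dots; they are assumptions for illustration, not values read from the file.

```r
# Hypothetical gene labels after read.csv() name mangling
gene_cols <- c("TSPAN6..ENSG00000000003.", "TNMD..ENSG00000000005.", "not_a_gene")

# Base-R equivalent of stringr::str_extract(): first match or NA
m <- regexpr("ENSG[0-9]+", gene_cols)
ensembl_ids <- rep(NA_character_, length(gene_cols))
ensembl_ids[m > 0] <- regmatches(gene_cols, m)

# Drop labels without an Ensembl ID and repeated IDs before using them as row names
keep <- !is.na(ensembl_ids) & !duplicated(ensembl_ids)
ensembl_ids[keep]
```

Labels that contain no Ensembl ID yield NA, so filtering on `keep` before assigning row names avoids NA or duplicated row names in the expression matrix.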
# Extract Ensembl ids from rownames
ensembl_ids <- stringr::str_extract(row.names(exp_matrix), "ENSG[0-9]+")
row.names(exp_matrix) <- ensembl_ids
# Use gprofiler2 to map Ensembl IDs to annotation
gmap <- gconvert(query = unique(ensembl_ids), organism = "hsapiens")
# Get valid ENSEMBL IDs from gprofiler2 mapping
valid_ensembl <- gmap$target
# Subset exp_matrix to keep only rows with valid ENSEMBL IDs
exp_matrix <- exp_matrix[rownames(exp_matrix) %in% valid_ensembl, ]
We used the limma pipeline for differential expression analysis. First, we filtered out lowly expressed genes. We used the same metadata as in the previous analysis.
# Load into edgeR
y <- DGEList(counts = exp_matrix)
# Grouping variable: primary vs metastasis
Cancer_type <- meta_intersect$type
# Design for primary vs metastasis
design = model.matrix(~0+Cancer_type, data = y$samples)
colnames(design) <- levels(Cancer_type)
# Make contrasts
contrast <- makeContrasts(Metastasis - Primary, levels=design)
# Filter lowly expressed genes
keep <- filterByExpr(y, design)
y <- y[keep, , keep.lib.sizes=FALSE]
# Normalization for composition bias
y <- calcNormFactors(y)
# Limma voom pipeline
v <- voom(y, design, plot = FALSE)
fit <- lmFit(v, design)
fit.cont <- contrasts.fit(fit, contrast)
fit.cont <- eBayes(fit.cont)
# Results (topTable returns only the 10 top-ranked genes unless number is set)
results <- topTable(fit.cont, coef="Metastasis - Primary")
After filtering, 17,049 genes remained. We found 1,488 underexpressed and 1,815 overexpressed genes in metastasis compared with primary samples.
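To show what the design matrix and the contrast encode, the sketch below fits the same kind of cell-means model to a single hypothetical gene in base R. lmFit() fits a linear model of this form gene by gene, and contrasts.fit() applies the contrast weights to the coefficients; the expression values here are invented for illustration.

```r
# Hypothetical log-expression of one gene in six samples
type <- factor(c("Metastasis", "Metastasis", "Metastasis",
                 "Primary", "Primary", "Primary"))
expr <- c(5.1, 4.8, 5.3, 6.9, 7.2, 7.0)

# Cell-means design: one column per group, no intercept
design <- model.matrix(~ 0 + type)
colnames(design) <- levels(type)

# Least-squares fit; each coefficient is the mean of its group
fit <- lm.fit(design, expr)
group_means <- fit$coefficients

# The contrast "Metastasis - Primary" as a weight vector over the coefficients
contrast <- c(Metastasis = 1, Primary = -1)
log_fc <- sum(contrast * group_means[names(contrast)])
log_fc
```

A negative value means lower expression in metastasis than in primary samples, which is how the underexpressed and overexpressed counts should be read.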
We performed GO enrichment analysis on the differentially expressed genes.
# GO enrichment using gprofiler2::gost instead of goana
# Select DE genes for enrichment (example: topTable results, or other selection)
de_genes <- rownames(results)[results$adj.P.Val < 0.05] # or other threshold/logic
# Run gprofiler2 enrichment using ENSEMBL IDs
go_res <- gost(query = de_genes, organism = "hsapiens", sources = c("GO:BP", "GO:MF", "GO:CC"))
# Show top results
head(go_res$result, 10)
## query significant p_value term_size query_size intersection_size
## 1 query_1 TRUE 0.001449914 74 7 3
## 2 query_1 TRUE 0.001510111 75 7 3
## 3 query_1 TRUE 0.039392389 222 7 3
## precision recall term_id source
## 1 0.4285714 0.04054054 GO:0071260 GO:BP
## 2 0.4285714 0.04000000 GO:0071496 GO:BP
## 3 0.4285714 0.01351351 GO:0009612 GO:BP
## term_name effective_domain_size source_order
## 1 cellular response to mechanical stimulus 21026 16288
## 2 cellular response to external stimulus 21026 16498
## 3 response to mechanical stimulus 21026 3526
## parents
## 1 GO:00096....
## 2 GO:0009605
## 3 GO:00096....
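GO analysis pointed mainly to underexpression of immune-response and stimulus-response terms in metastasis relative to primary samples. The gprofiler2 output above lists the stimulus-response terms, for example GO:0071260 (cellular response to mechanical stimulus).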
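The precision and recall columns in the gost() output follow directly from the reported sizes, and the same numbers feed the over-representation test. The sketch below recomputes them for the first row of the output above. The hypergeometric tail probability is an unadjusted value shown only for illustration; it is not g:Profiler's corrected p_value.

```r
# Values copied from the first row of the gost() output shown above
term_size <- 74            # genes annotated to GO:0071260
query_size <- 7            # query genes recognised by g:Profiler
intersection_size <- 3     # query genes annotated to the term
domain_size <- 21026       # effective_domain_size

# Fraction of the query that falls in the term, and of the term covered
precision <- intersection_size / query_size
recall <- intersection_size / term_size

# Unadjusted hypergeometric tail probability of seeing >= 3 overlapping genes.
# gost() reports multiple-testing-corrected p-values, so this raw value is
# expected to be smaller than the p_value column.
raw_p <- phyper(intersection_size - 1, term_size, domain_size - term_size,
                query_size, lower.tail = FALSE)

round(c(precision = precision, recall = recall), 7)
raw_p
```

Note that query_size is 7: the enrichment was run on a very small gene list, which is consistent with topTable() returning only its default number of top-ranked genes.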
# Load into edgeR
y <- DGEList(counts = exp_matrix)
# Design for mutation count
Mutation_count <- meta_intersect$mut_counts
design = model.matrix(~0+Mutation_count, data = y$samples)
colnames(design) <- levels(Mutation_count)
design
## Low Medium High
## ACH-000001 0 1 0
## ACH-000002 1 0 0
## ACH-000006 1 0 0
## ACH-000007 0 1 0
## ACH-000009 0 1 0
## ACH-000011 0 1 0
## ACH-000012 1 0 0
## ACH-000013 1 0 0
## ACH-000014 0 0 1
## ACH-000015 0 1 0
## ACH-000016 1 0 0
## ACH-000017 0 1 0
## ACH-000018 1 0 0
## ACH-000019 0 0 1
## ACH-000021 0 1 0
## ACH-000022 1 0 0
## ACH-000023 0 1 0
## ACH-000024 0 0 1
## ACH-000026 0 1 0
## ACH-000027 1 0 0
## ACH-000028 0 0 1
## ACH-000029 0 1 0
## ACH-000030 0 0 1
## ACH-000031 1 0 0
## ACH-000033 1 0 0
## ACH-000035 1 0 0
## ACH-000039 1 0 0
## ACH-000040 1 0 0
## ACH-000041 1 0 0
## ACH-000042 1 0 0
## ACH-000043 1 0 0
## ACH-000044 1 0 0
## ACH-000045 1 0 0
## ACH-000046 1 0 0
## ACH-000048 1 0 0
## ACH-000050 0 1 0
## ACH-000051 1 0 0
## ACH-000052 1 0 0
## ACH-000054 1 0 0
## ACH-000055 1 0 0
## ACH-000056 1 0 0
## ACH-000059 1 0 0
## ACH-000060 1 0 0
## ACH-000062 0 1 0
## ACH-000065 1 0 0
## ACH-000066 1 0 0
## ACH-000067 1 0 0
## ACH-000070 1 0 0
## ACH-000073 1 0 0
## ACH-000075 1 0 0
## ACH-000078 1 0 0
## ACH-000082 1 0 0
## ACH-000085 1 0 0
## ACH-000087 1 0 0
## ACH-000089 1 0 0
## ACH-000090 0 0 1
## ACH-000091 1 0 0
## ACH-000092 1 0 0
## ACH-000093 1 0 0
## ACH-000094 1 0 0
## ACH-000096 1 0 0
## ACH-000097 0 1 0
## ACH-000098 1 0 0
## ACH-000100 1 0 0
## ACH-000102 1 0 0
## ACH-000103 1 0 0
## ACH-000105 1 0 0
## ACH-000107 1 0 0
## ACH-000108 0 1 0
## ACH-000109 0 1 0
## ACH-000111 1 0 0
## ACH-000113 1 0 0
## ACH-000114 1 0 0
## ACH-000115 1 0 0
## ACH-000116 1 0 0
## ACH-000117 1 0 0
## ACH-000118 0 1 0
## ACH-000121 1 0 0
## ACH-000123 1 0 0
## ACH-000124 1 0 0
## ACH-000125 1 0 0
## ACH-000126 0 1 0
## ACH-000127 0 1 0
## ACH-000129 1 0 0
## ACH-000131 1 0 0
## ACH-000132 0 1 0
## ACH-000133 1 0 0
## ACH-000135 1 0 0
## ACH-000136 1 0 0
## ACH-000137 1 0 0
## ACH-000138 1 0 0
## ACH-000139 1 0 0
## ACH-000140 1 0 0
## ACH-000142 1 0 0
## ACH-000144 0 1 0
## ACH-000145 1 0 0
## ACH-000146 0 1 0
## ACH-000147 1 0 0
## ACH-000148 1 0 0
## ACH-000149 1 0 0
## ACH-000150 1 0 0
## ACH-000152 1 0 0
## ACH-000153 0 1 0
## ACH-000155 1 0 0
## ACH-000157 0 0 1
## ACH-000158 0 1 0
## ACH-000159 0 1 0
## ACH-000161 1 0 0
## ACH-000163 1 0 0
## ACH-000164 1 0 0
## ACH-000166 0 1 0
## ACH-000167 0 1 0
## ACH-000168 0 1 0
## ACH-000169 0 1 0
## ACH-000171 1 0 0
## ACH-000172 1 0 0
## ACH-000174 1 0 0
## ACH-000176 1 0 0
## ACH-000177 0 1 0
## ACH-000178 1 0 0
## ACH-000179 0 1 0
## ACH-000181 0 1 0
## ACH-000183 0 1 0
## ACH-000186 0 1 0
## ACH-000187 1 0 0
## ACH-000188 1 0 0
## ACH-000189 0 1 0
## ACH-000191 1 0 0
## ACH-000192 0 1 0
## ACH-000193 1 0 0
## ACH-000194 1 0 0
## ACH-000196 0 0 1
## ACH-000197 0 1 0
## ACH-000200 0 1 0
## ACH-000201 1 0 0
## ACH-000202 0 1 0
## ACH-000203 0 1 0
## ACH-000204 0 1 0
## ACH-000205 0 1 0
## ACH-000207 0 1 0
## ACH-000210 0 1 0
## ACH-000211 0 1 0
## ACH-000212 1 0 0
## ACH-000213 0 1 0
## ACH-000217 0 1 0
## ACH-000219 0 0 1
## ACH-000221 0 1 0
## ACH-000222 0 1 0
## ACH-000223 0 1 0
## ACH-000227 0 1 0
## ACH-000228 1 0 0
## ACH-000229 1 0 0
## ACH-000231 0 1 0
## ACH-000232 0 1 0
## ACH-000234 1 0 0
## ACH-000235 0 1 0
## ACH-000236 0 1 0
## ACH-000237 0 1 0
## ACH-000238 0 1 0
## ACH-000239 0 1 0
## ACH-000240 1 0 0
## ACH-000242 1 0 0
## ACH-000244 1 0 0
## ACH-000246 0 1 0
## ACH-000247 0 1 0
## ACH-000248 0 0 1
## ACH-000249 1 0 0
## ACH-000250 0 1 0
## ACH-000252 0 1 0
## ACH-000253 0 1 0
## ACH-000255 0 1 0
## ACH-000256 1 0 0
## ACH-000257 0 1 0
## ACH-000258 0 1 0
## ACH-000259 0 1 0
## ACH-000260 1 0 0
## ACH-000261 0 1 0
## ACH-000263 0 1 0
## ACH-000264 0 1 0
## ACH-000265 0 1 0
## ACH-000269 0 1 0
## ACH-000270 0 1 0
## ACH-000271 1 0 0
## ACH-000272 1 0 0
## ACH-000273 1 0 0
## ACH-000275 1 0 0
## ACH-000276 0 1 0
## ACH-000277 0 1 0
## ACH-000278 0 1 0
## ACH-000280 0 1 0
## ACH-000281 0 1 0
## ACH-000282 0 1 0
## ACH-000285 0 1 0
## ACH-000286 0 1 0
## ACH-000288 1 0 0
## ACH-000290 0 1 0
## ACH-000291 1 0 0
## ACH-000292 0 1 0
## ACH-000293 1 0 0
## ACH-000294 0 1 0
## ACH-000296 0 0 1
## ACH-000297 0 1 0
## ACH-000298 0 1 0
## ACH-000300 1 0 0
## ACH-000301 0 1 0
## ACH-000302 0 1 0
## ACH-000303 0 1 0
## ACH-000304 0 0 1
## ACH-000306 1 0 0
## ACH-000307 0 1 0
## ACH-000308 0 1 0
## ACH-000310 1 0 0
## ACH-000311 0 1 0
## ACH-000312 0 1 0
## ACH-000313 0 1 0
## ACH-000314 0 0 1
## ACH-000315 0 1 0
## ACH-000317 0 1 0
## ACH-000318 0 1 0
## ACH-000320 0 1 0
## ACH-000322 0 1 0
## ACH-000323 1 0 0
## ACH-000324 0 1 0
## ACH-000325 0 1 0
## ACH-000327 1 0 0
## ACH-000329 1 0 0
## ACH-000330 0 1 0
## ACH-000332 0 1 0
## ACH-000333 1 0 0
## ACH-000334 0 1 0
## ACH-000335 1 0 0
## ACH-000336 1 0 0
## ACH-000338 0 1 0
## ACH-000341 0 1 0
## ACH-000343 0 1 0
## ACH-000344 0 1 0
## ACH-000345 0 1 0
## ACH-000347 0 0 1
## ACH-000348 0 1 0
## ACH-000349 0 1 0
## ACH-000350 0 0 1
## ACH-000351 0 1 0
## ACH-000352 0 1 0
## ACH-000353 0 0 1
## ACH-000354 0 1 0
## ACH-000355 0 1 0
## ACH-000356 0 1 0
## ACH-000358 0 1 0
## ACH-000359 1 0 0
## ACH-000360 0 1 0
## ACH-000361 1 0 0
## ACH-000362 0 1 0
## ACH-000363 0 1 0
## ACH-000364 0 1 0
## ACH-000365 0 1 0
## ACH-000366 0 1 0
## ACH-000367 1 0 0
## ACH-000368 0 1 0
## ACH-000373 0 1 0
## ACH-000374 1 0 0
## ACH-000375 0 1 0
## ACH-000376 0 1 0
## ACH-000378 0 1 0
## ACH-000379 0 1 0
## ACH-000380 0 1 0
## ACH-000381 0 0 1
## ACH-000382 0 1 0
## ACH-000383 0 1 0
## ACH-000384 0 1 0
## ACH-000386 0 1 0
## ACH-000388 0 1 0
## ACH-000389 0 1 0
## ACH-000390 0 1 0
## ACH-000391 0 1 0
## ACH-000392 0 1 0
## ACH-000393 0 1 0
## ACH-000394 0 0 1
## ACH-000395 0 0 1
## ACH-000396 0 1 0
## ACH-000397 0 1 0
## ACH-000399 0 1 0
## ACH-000400 0 1 0
## ACH-000401 0 1 0
## ACH-000403 1 0 0
## ACH-000404 0 1 0
## ACH-000406 0 1 0
## ACH-000407 0 1 0
## ACH-000409 0 1 0
## ACH-000410 0 1 0
## ACH-000411 1 0 0
## ACH-000412 0 1 0
## ACH-000414 0 0 1
## ACH-000415 1 0 0
## ACH-000416 0 0 1
## ACH-000417 0 1 0
## ACH-000418 1 0 0
## ACH-000419 0 0 1
## ACH-000420 0 1 0
## ACH-000421 0 1 0
## ACH-000422 0 1 0
## ACH-000423 0 1 0
## ACH-000424 0 1 0
## ACH-000427 0 1 0
## ACH-000429 0 0 1
## ACH-000430 0 1 0
## ACH-000431 0 0 1
## ACH-000433 1 0 0
## ACH-000434 0 0 1
## ACH-000435 1 0 0
## ACH-000437 0 1 0
## ACH-000438 0 1 0
## ACH-000441 0 1 0
## ACH-000442 0 1 0
## ACH-000443 0 1 0
## ACH-000444 0 0 1
## ACH-000445 0 1 0
## ACH-000446 0 1 0
## ACH-000447 0 1 0
## ACH-000448 0 0 1
## ACH-000449 0 1 0
## ACH-000450 0 1 0
## ACH-000451 0 1 0
## ACH-000453 0 1 0
## ACH-000454 0 1 0
## ACH-000456 0 1 0
## ACH-000457 0 1 0
## ACH-000458 0 1 0
## ACH-000459 0 1 0
## ACH-000460 0 1 0
## ACH-000463 0 0 1
## ACH-000464 0 1 0
## ACH-000465 0 1 0
## ACH-000466 0 1 0
## ACH-000468 0 1 0
## ACH-000469 0 1 0
## ACH-000470 0 0 1
## ACH-000472 0 0 1
## ACH-000473 0 0 1
## ACH-000475 0 0 1
## ACH-000476 0 1 0
## ACH-000477 0 1 0
## ACH-000478 0 1 0
## ACH-000479 0 1 0
## ACH-000480 0 1 0
## ACH-000481 0 0 1
## ACH-000482 0 1 0
## ACH-000483 0 1 0
## ACH-000484 1 0 0
## ACH-000485 0 1 0
## ACH-000487 0 1 0
## ACH-000488 0 0 1
## ACH-000489 0 1 0
## ACH-000490 0 1 0
## ACH-000491 0 0 1
## ACH-000493 0 1 0
## ACH-000495 0 1 0
## ACH-000496 0 1 0
## ACH-000501 0 0 1
## ACH-000502 1 0 0
## ACH-000503 0 1 0
## ACH-000504 0 1 0
## ACH-000505 0 1 0
## ACH-000506 0 1 0
## ACH-000507 0 0 1
## ACH-000508 0 0 1
## ACH-000510 0 0 1
## ACH-000511 0 0 1
## ACH-000514 0 0 1
## ACH-000515 0 0 1
## ACH-000517 0 1 0
## ACH-000518 0 1 0
## ACH-000520 0 1 0
## ACH-000521 0 0 1
## ACH-000522 0 1 0
## ACH-000523 0 0 1
## ACH-000524 0 1 0
## ACH-000525 0 1 0
## ACH-000526 1 0 0
## ACH-000527 0 0 1
## ACH-000528 0 0 1
## ACH-000530 0 1 0
## ACH-000532 0 1 0
## ACH-000534 0 1 0
## ACH-000535 1 0 0
## ACH-000536 0 0 1
## ACH-000538 0 1 0
## ACH-000539 1 0 0
## ACH-000541 0 0 1
## ACH-000542 0 1 0
## ACH-000544 0 1 0
## ACH-000545 0 0 1
## ACH-000546 0 0 1
## ACH-000547 0 1 0
## ACH-000548 0 0 1
## ACH-000550 0 0 1
## ACH-000551 0 0 1
## ACH-000552 0 0 1
## ACH-000553 0 0 1
## ACH-000554 0 0 1
## ACH-000555 1 0 0
## ACH-000557 1 0 0
## ACH-000558 0 1 0
## ACH-000559 0 0 1
## ACH-000561 0 0 1
## ACH-000562 0 1 0
## ACH-000563 0 0 1
## ACH-000564 0 0 1
## ACH-000565 0 0 1
## ACH-000566 0 1 0
## ACH-000568 0 1 0
## ACH-000569 1 0 0
## ACH-000570 0 0 1
## ACH-000571 0 1 0
## ACH-000572 0 1 0
## ACH-000573 0 1 0
## ACH-000574 0 0 1
## ACH-000576 0 0 1
## ACH-000577 0 1 0
## ACH-000578 0 1 0
## ACH-000579 0 0 1
## ACH-000580 0 1 0
## ACH-000581 0 1 0
## ACH-000582 0 0 1
## ACH-000584 0 0 1
## ACH-000585 0 0 1
## ACH-000586 0 0 1
## ACH-000587 0 1 0
## ACH-000588 0 0 1
## ACH-000589 0 0 1
## ACH-000590 0 0 1
## ACH-000593 0 1 0
## ACH-000594 0 0 1
## ACH-000595 0 0 1
## ACH-000596 0 0 1
## ACH-000598 0 0 1
## ACH-000599 0 1 0
## ACH-000600 1 0 0
## ACH-000601 0 1 0
## ACH-000603 0 0 1
## ACH-000606 1 0 0
## ACH-000607 0 0 1
## ACH-000608 0 1 0
## ACH-000609 0 1 0
## ACH-000610 0 0 1
## ACH-000611 0 0 1
## ACH-000613 1 0 0
## ACH-000614 0 0 1
## ACH-000616 0 0 1
## ACH-000617 0 1 0
## ACH-000619 0 1 0
## ACH-000620 0 0 1
## ACH-000621 0 0 1
## ACH-000622 0 0 1
## ACH-000623 0 1 0
## ACH-000624 0 1 0
## ACH-000625 0 1 0
## ACH-000627 0 0 1
## ACH-000628 0 0 1
## ACH-000631 0 0 1
## ACH-000632 0 0 1
## ACH-000633 0 0 1
## ACH-000635 0 0 1
## ACH-000637 0 0 1
## ACH-000638 0 0 1
## ACH-000639 0 0 1
## ACH-000640 0 0 1
## ACH-000643 0 0 1
## ACH-000644 0 1 0
## ACH-000645 1 0 0
## ACH-000646 0 0 1
## ACH-000647 0 0 1
## ACH-000648 0 1 0
## ACH-000649 0 0 1
## ACH-000650 0 0 1
## ACH-000651 0 0 1
## ACH-000652 0 0 1
## ACH-000653 0 1 0
## ACH-000655 0 1 0
## ACH-000656 0 0 1
## ACH-000657 0 0 1
## ACH-000658 0 0 1
## ACH-000659 0 0 1
## ACH-000660 0 1 0
## ACH-000662 0 0 1
## ACH-000663 0 0 1
## ACH-000665 0 0 1
## ACH-000666 0 0 1
## ACH-000667 0 0 1
## ACH-000668 0 1 0
## ACH-000669 0 0 1
## ACH-000670 0 0 1
## ACH-000672 0 0 1
## ACH-000674 0 0 1
## ACH-000675 0 0 1
## ACH-000677 0 1 0
## ACH-000678 0 0 1
## ACH-000679 0 0 1
## ACH-000680 0 0 1
## ACH-000681 0 0 1
## ACH-000683 0 0 1
## ACH-000684 0 1 0
## ACH-000685 0 1 0
## ACH-000686 0 1 0
## ACH-000688 1 0 0
## ACH-000691 0 0 1
## ACH-000692 0 1 0
## ACH-000693 0 0 1
## ACH-000694 0 0 1
## ACH-000695 0 0 1
## ACH-000696 0 0 1
## ACH-000697 0 0 1
## ACH-000698 0 0 1
## ACH-000699 0 0 1
## ACH-000701 0 1 0
## ACH-000703 0 0 1
## ACH-000704 0 1 0
## ACH-000705 0 0 1
## ACH-000706 0 0 1
## ACH-000708 0 1 0
## ACH-000709 0 1 0
## ACH-000710 1 0 0
## ACH-000711 1 0 0
## ACH-000712 0 1 0
## ACH-000713 0 1 0
## ACH-000714 0 0 1
## ACH-000718 0 0 1
## ACH-000719 0 1 0
## ACH-000720 0 1 0
## ACH-000721 0 0 1
## ACH-000722 0 0 1
## ACH-000724 0 1 0
## ACH-000725 0 0 1
## ACH-000729 0 0 1
## ACH-000730 0 0 1
## ACH-000731 1 0 0
## ACH-000732 1 0 0
## ACH-000733 0 0 1
## ACH-000734 0 0 1
## ACH-000735 0 1 0
## ACH-000736 0 0 1
## ACH-000737 0 1 0
## ACH-000738 0 0 1
## ACH-000739 0 1 0
## ACH-000740 0 0 1
## ACH-000741 0 1 0
## ACH-000743 0 0 1
## ACH-000744 0 0 1
## ACH-000745 0 0 1
## ACH-000746 0 0 1
## ACH-000747 0 0 1
## ACH-000748 0 1 0
## ACH-000749 0 0 1
## ACH-000750 0 0 1
## ACH-000752 0 0 1
## ACH-000753 0 0 1
## ACH-000755 0 1 0
## ACH-000756 0 0 1
## ACH-000758 0 0 1
## ACH-000759 0 0 1
## ACH-000761 0 1 0
## ACH-000763 0 0 1
## ACH-000764 0 0 1
## ACH-000765 0 1 0
## ACH-000766 0 0 1
## ACH-000767 0 0 1
## ACH-000768 0 0 1
## ACH-000769 0 0 1
## ACH-000771 0 1 0
## ACH-000774 0 0 1
## ACH-000775 0 0 1
## ACH-000776 0 0 1
## ACH-000777 0 0 1
## ACH-000778 0 0 1
## ACH-000780 0 0 1
## ACH-000781 0 0 1
## ACH-000782 0 0 1
## ACH-000783 0 0 1
## ACH-000784 0 0 1
## ACH-000785 0 0 1
## ACH-000787 0 0 1
## ACH-000788 0 0 1
## ACH-000789 0 0 1
## ACH-000790 0 0 1
## ACH-000791 0 0 1
## ACH-000792 0 1 0
## ACH-000793 0 0 1
## ACH-000794 0 1 0
## ACH-000796 0 0 1
## ACH-000797 0 1 0
## ACH-000798 0 1 0
## ACH-000799 0 0 1
## ACH-000800 0 0 1
## ACH-000802 0 0 1
## ACH-000803 0 0 1
## ACH-000804 0 0 1
## ACH-000805 0 0 1
## ACH-000808 0 0 1
## ACH-000809 0 0 1
## ACH-000810 0 0 1
## ACH-000811 0 0 1
## ACH-000812 0 0 1
## ACH-000813 0 0 1
## ACH-000815 0 0 1
## ACH-000816 0 0 1
## ACH-000817 0 0 1
## ACH-000818 0 0 1
## ACH-000819 0 0 1
## ACH-000820 0 0 1
## ACH-000821 0 0 1
## ACH-000822 0 0 1
## ACH-000823 0 0 1
## ACH-000824 0 0 1
## ACH-000825 0 0 1
## ACH-000826 0 0 1
## ACH-000828 0 0 1
## ACH-000830 0 0 1
## ACH-000831 0 0 1
## ACH-000832 0 0 1
## ACH-000833 0 0 1
## ACH-000834 0 1 0
## ACH-000835 0 0 1
## ACH-000837 0 0 1
## ACH-000838 0 0 1
## ACH-000839 0 0 1
## ACH-000840 0 0 1
## ACH-000841 0 0 1
## ACH-000842 0 0 1
## ACH-000843 0 0 1
## ACH-000844 0 0 1
## ACH-000845 0 0 1
## ACH-000846 0 0 1
## ACH-000847 0 0 1
## ACH-000848 0 0 1
## ACH-000849 0 0 1
## ACH-000850 1 0 0
## ACH-000851 0 0 1
## ACH-000852 0 0 1
## ACH-000853 0 0 1
## ACH-000855 0 0 1
## ACH-000856 0 0 1
## ACH-000857 0 1 0
## ACH-000858 0 0 1
## ACH-000859 0 0 1
## ACH-000860 0 0 1
## ACH-000861 0 0 1
## ACH-000862 0 0 1
## ACH-000863 0 0 1
## ACH-000864 0 0 1
## ACH-000865 0 0 1
## ACH-000866 0 0 1
## ACH-000867 0 0 1
## ACH-000868 0 0 1
## ACH-000869 0 0 1
## ACH-000870 0 0 1
## ACH-000871 0 0 1
## ACH-000873 0 0 1
## ACH-000874 0 0 1
## ACH-000875 0 0 1
## ACH-000876 0 0 1
## ACH-000877 0 0 1
## ACH-000878 0 0 1
## ACH-000879 0 0 1
## ACH-000880 0 0 1
## ACH-000881 0 0 1
## ACH-000882 0 0 1
## ACH-000883 0 0 1
## ACH-000884 0 0 1
## ACH-000885 0 0 1
## ACH-000886 0 0 1
## ACH-000888 0 0 1
## ACH-000889 0 0 1
## ACH-000890 0 0 1
## ACH-000891 0 0 1
## ACH-000892 0 0 1
## ACH-000893 0 0 1
## ACH-000894 0 0 1
## ACH-000895 0 0 1
## ACH-000896 0 0 1
## ACH-000897 0 0 1
## ACH-000898 0 0 1
## ACH-000899 0 0 1
## ACH-000900 0 0 1
## ACH-000901 0 0 1
## ACH-000902 0 0 1
## ACH-000903 0 0 1
## ACH-000904 0 0 1
## ACH-000906 0 0 1
## ACH-000907 0 0 1
## ACH-000908 0 0 1
## ACH-000909 0 0 1
## ACH-000910 0 0 1
## ACH-000911 0 0 1
## ACH-000912 0 0 1
## ACH-000913 0 0 1
## ACH-000915 0 0 1
## ACH-000916 0 0 1
## ACH-000919 0 0 1
## ACH-000921 0 0 1
## ACH-000924 0 0 1
## ACH-000925 0 0 1
## ACH-000926 0 0 1
## ACH-000927 0 0 1
## ACH-000928 0 0 1
## ACH-000929 0 0 1
## ACH-000930 0 0 1
## ACH-000931 0 0 1
## ACH-000932 0 0 1
## ACH-000934 0 0 1
## ACH-000936 0 0 1
## ACH-000938 0 0 1
## ACH-000939 0 0 1
## ACH-000940 0 0 1
## ACH-000941 0 0 1
## ACH-000942 0 0 1
## ACH-000943 0 0 1
## ACH-000945 0 0 1
## ACH-000946 0 0 1
## ACH-000947 0 0 1
## ACH-000948 0 0 1
## ACH-000949 0 0 1
## ACH-000950 0 0 1
## ACH-000951 0 0 1
## ACH-000952 0 0 1
## ACH-000953 0 0 1
## ACH-000954 0 0 1
## ACH-000955 0 0 1
## ACH-000956 0 0 1
## ACH-000957 0 0 1
## ACH-000958 0 0 1
## ACH-000960 0 0 1
## ACH-000961 0 0 1
## ACH-000962 0 0 1
## ACH-000963 0 0 1
## ACH-000965 0 0 1
## ACH-000966 0 0 1
## ACH-000967 0 0 1
## ACH-000968 0 0 1
## ACH-000969 0 0 1
## ACH-000971 0 0 1
## ACH-000972 0 0 1
## ACH-000973 0 0 1
## ACH-000974 0 0 1
## ACH-000976 0 0 1
## ACH-000977 0 0 1
## ACH-000978 0 0 1
## ACH-000980 0 0 1
## ACH-000981 0 0 1
## ACH-000982 0 0 1
## ACH-000984 0 0 1
## ACH-000985 0 0 1
## ACH-000987 0 0 1
## ACH-000989 0 0 1
## ACH-000990 0 0 1
## ACH-000993 0 0 1
## ACH-000994 0 0 1
## ACH-000995 0 0 1
## ACH-000996 0 0 1
## ACH-000997 0 0 1
## ACH-000998 0 0 1
## ACH-001001 0 1 0
## ACH-001041 1 0 0
## ACH-001048 1 0 0
## ACH-001061 0 0 1
## ACH-001075 0 1 0
## ACH-001078 0 1 0
## ACH-001106 0 1 0
## ACH-001113 0 0 1
## ACH-001129 1 0 0
## ACH-001145 0 0 1
## ACH-001151 1 0 0
## ACH-001163 1 0 0
## ACH-001184 0 0 1
## ACH-001190 0 0 1
## ACH-001192 1 0 0
## ACH-001194 1 0 0
## ACH-001200 0 1 0
## ACH-001210 1 0 0
## ACH-001229 1 0 0
## ACH-001239 0 0 1
## ACH-001277 0 1 0
## ACH-001278 1 0 0
## ACH-001283 1 0 0
## ACH-001306 0 1 0
## ACH-001307 0 1 0
## ACH-001318 0 0 1
## ACH-001321 0 1 0
## ACH-001328 0 0 1
## ACH-001329 0 1 0
## ACH-001332 1 0 0
## ACH-001333 0 0 1
## ACH-001334 0 1 0
## ACH-001335 0 1 0
## ACH-001336 0 0 1
## ACH-001339 0 1 0
## ACH-001340 1 0 0
## ACH-001341 0 1 0
## ACH-001344 0 0 1
## ACH-001345 0 0 1
## ACH-001346 1 0 0
## ACH-001347 1 0 0
## ACH-001353 1 0 0
## ACH-001354 1 0 0
## ACH-001356 1 0 0
## ACH-001360 1 0 0
## ACH-001366 0 0 1
## ACH-001367 0 0 1
## ACH-001368 1 0 0
## ACH-001369 0 0 1
## ACH-001370 1 0 0
## ACH-001373 1 0 0
## ACH-001374 0 1 0
## ACH-001375 1 0 0
## ACH-001376 1 0 0
## ACH-001377 0 0 1
## ACH-001378 1 0 0
## ACH-001379 1 0 0
## ACH-001380 1 0 0
## ACH-001382 1 0 0
## ACH-001384 1 0 0
## ACH-001385 1 0 0
## ACH-001386 0 0 1
## ACH-001388 1 0 0
## ACH-001389 0 1 0
## ACH-001390 0 0 1
## ACH-001391 1 0 0
## ACH-001392 1 0 0
## ACH-001394 1 0 0
## ACH-001395 0 1 0
## ACH-001396 1 0 0
## ACH-001398 0 0 1
## ACH-001399 0 1 0
## ACH-001400 0 1 0
## ACH-001401 0 0 1
## ACH-001402 0 0 1
## ACH-001403 1 0 0
## ACH-001407 0 0 1
## ACH-001408 0 1 0
## ACH-001409 0 1 0
## ACH-001410 1 0 0
## ACH-001411 0 1 0
## ACH-001412 0 0 1
## ACH-001413 0 0 1
## ACH-001414 0 0 1
## ACH-001415 0 0 1
## ACH-001416 0 0 1
## ACH-001418 1 0 0
## ACH-001419 1 0 0
## ACH-001421 1 0 0
## ACH-001422 1 0 0
## ACH-001433 1 0 0
## ACH-001441 1 0 0
## ACH-001442 0 1 0
## ACH-001443 0 0 1
## ACH-001450 1 0 0
## ACH-001451 1 0 0
## ACH-001453 1 0 0
## ACH-001454 1 0 0
## ACH-001456 1 0 0
## ACH-001458 0 1 0
## ACH-001459 1 0 0
## ACH-001460 1 0 0
## ACH-001461 1 0 0
## ACH-001484 0 0 1
## ACH-001485 0 0 1
## ACH-001494 0 1 0
## ACH-001495 1 0 0
## ACH-001496 0 1 0
## ACH-001497 0 1 0
## ACH-001498 0 0 1
## ACH-001500 0 1 0
## ACH-001509 1 0 0
## ACH-001510 0 1 0
## ACH-001511 1 0 0
## ACH-001513 0 0 1
## ACH-001515 0 0 1
## ACH-001516 0 0 1
## ACH-001517 0 0 1
## ACH-001518 0 0 1
## ACH-001519 0 1 0
## ACH-001520 1 0 0
## ACH-001521 0 0 1
## ACH-001522 0 1 0
## ACH-001523 0 0 1
## ACH-001524 0 0 1
## ACH-001525 0 0 1
## ACH-001526 0 1 0
## ACH-001528 0 0 1
## ACH-001529 0 0 1
## ACH-001530 0 0 1
## ACH-001532 1 0 0
## ACH-001536 0 0 1
## ACH-001538 0 1 0
## ACH-001539 0 0 1
## ACH-001540 0 1 0
## ACH-001541 0 1 0
## ACH-001542 0 0 1
## ACH-001543 0 1 0
## ACH-001548 1 0 0
## ACH-001549 0 0 1
## ACH-001550 0 0 1
## ACH-001551 0 0 1
## ACH-001552 0 0 1
## ACH-001554 1 0 0
## ACH-001555 1 0 0
## ACH-001556 1 0 0
## ACH-001557 1 0 0
## ACH-001558 1 0 0
## ACH-001559 1 0 0
## ACH-001560 1 0 0
## ACH-001561 1 0 0
## ACH-001562 1 0 0
## ACH-001563 0 0 1
## ACH-001566 0 0 1
## ACH-001567 0 0 1
## ACH-001568 0 0 1
## ACH-001569 0 0 1
## ACH-001570 0 0 1
## ACH-001573 1 0 0
## ACH-001574 1 0 0
## ACH-001577 1 0 0
## ACH-001578 0 1 0
## ACH-001603 0 1 0
## ACH-001605 0 1 0
## ACH-001607 0 1 0
## ACH-001608 0 0 1
## ACH-001609 0 1 0
## ACH-001610 0 0 1
## ACH-001611 0 1 0
## ACH-001613 1 0 0
## ACH-001616 1 0 0
## ACH-001617 1 0 0
## ACH-001618 1 0 0
## ACH-001619 0 1 0
## ACH-001622 0 1 0
## ACH-001623 0 0 1
## ACH-001624 0 1 0
## ACH-001625 0 0 1
## ACH-001626 0 1 0
## ACH-001627 1 0 0
## ACH-001628 1 0 0
## ACH-001630 0 1 0
## ACH-001632 1 0 0
## ACH-001634 1 0 0
## ACH-001636 1 0 0
## ACH-001638 0 0 1
## ACH-001642 0 1 0
## ACH-001645 0 0 1
## ACH-001647 0 1 0
## ACH-001648 1 0 0
## ACH-001649 1 0 0
## ACH-001650 0 0 1
## ACH-001651 1 0 0
## ACH-001652 0 1 0
## ACH-001653 0 0 1
## ACH-001654 1 0 0
## ACH-001655 0 1 0
## ACH-001656 0 1 0
## ACH-001664 0 0 1
## ACH-001668 1 0 0
## ACH-001670 0 1 0
## ACH-001673 0 1 0
## ACH-001674 0 0 1
## ACH-001677 1 0 0
## ACH-001685 0 1 0
## ACH-001687 0 1 0
## ACH-001688 0 1 0
## ACH-001690 1 0 0
## ACH-001692 1 0 0
## ACH-001694 0 0 1
## ACH-001698 1 0 0
## ACH-001699 0 1 0
## ACH-001702 1 0 0
## ACH-001703 1 0 0
## ACH-001709 0 0 1
## ACH-001711 1 0 0
## ACH-001719 0 0 1
## ACH-001740 0 1 0
## ACH-001765 1 0 0
## ACH-001786 0 0 1
## ACH-001794 1 0 0
## ACH-001807 1 0 0
## ACH-001814 1 0 0
## ACH-001818 1 0 0
## ACH-001819 0 1 0
## ACH-001820 0 1 0
## ACH-001850 0 1 0
## ACH-001861 0 1 0
## ACH-001961 1 0 0
## ACH-001991 0 0 1
## ACH-002011 1 0 0
## ACH-002015 1 0 0
## ACH-002016 1 0 0
## ACH-002017 0 1 0
## ACH-002018 1 0 0
## ACH-002019 0 1 0
## ACH-002022 0 0 1
## ACH-002023 0 1 0
## ACH-002024 0 0 1
## ACH-002025 0 1 0
## ACH-002026 0 0 1
## ACH-002027 0 0 1
## ACH-002029 0 0 1
## ACH-002038 0 1 0
## ACH-002039 0 1 0
## ACH-002041 0 0 1
## ACH-002042 0 1 0
## ACH-002044 0 1 0
## ACH-002045 0 0 1
## ACH-002046 0 1 0
## ACH-002059 0 0 1
## ACH-002062 0 0 1
## ACH-002065 0 1 0
## ACH-002066 0 1 0
## ACH-002067 0 1 0
## ACH-002069 0 1 0
## ACH-002446 0 1 0
## ACH-002508 0 0 1
## ACH-002509 0 0 1
## ACH-002510 0 0 1
## ACH-002511 1 0 0
## attr(,"assign")
## [1] 1 1 1
## attr(,"contrasts")
## attr(,"contrasts")$Mutation_count
## [1] "contr.treatment"
# Make contrasts
contrast <- makeContrasts(MvsL = Medium - Low,
                          HvsM = High - Medium,
                          HvsL = High - Low,
                          levels = design)
keep <- filterByExpr(y, design)
y <- y[keep, , keep.lib.sizes=FALSE]
barplot(y$samples$lib.size,names=colnames(y),las=2)
# Add a title to the plot
title("Barplot of library sizes")
# Normalization for composition bias
y <- calcNormFactors(y)
# Start limma voom pipeline
par(mfrow=c(1,1))
v <- voom(y, design, plot = TRUE)
# limma-voom: fit linear models and apply the contrasts (trend = FALSE below)
fit <- lmFit(v, design)
fit.cont <- contrasts.fit(fit, contrast)
fit.cont <- eBayes(fit.cont, trend = FALSE, robust = FALSE)
summa.fit <- decideTests(fit.cont)
summary(summa.fit)
## MvsL HvsM HvsL
## Down 908 902 3964
## NotSig 19843 20543 13636
## Up 1461 767 4612
## logFC AveExpr t P.Value adj.P.Val
## ENSG00000224271 2.510643 -3.3884166903 10.473678 1.882903e-24 4.182304e-20
## ENSG00000221867 3.845289 0.5383173374 9.534024 1.067607e-20 1.185684e-16
## ENSG00000197172 3.699421 0.0591463471 9.451578 2.208600e-20 1.635247e-16
## ENSG00000279712 2.071435 -4.1058346623 9.385180 3.951306e-20 2.194160e-16
## ENSG00000132274 -2.491091 1.6510253853 -9.163649 2.685327e-19 1.192930e-15
## ENSG00000154175 -2.179250 -0.0008381123 -9.032198 8.223786e-19 3.044446e-15
## ENSG00000172638 -1.789188 2.2265786145 -8.930388 1.938837e-18 5.383182e-15
## ENSG00000268606 3.451761 -1.4222605719 8.950203 1.641822e-18 5.209735e-15
## ENSG00000282944 1.910403 -4.9031569178 8.820372 4.853819e-18 1.197923e-14
## ENSG00000205634 1.794872 -4.7852617419 8.800423 5.726758e-18 1.272027e-14
## B
## ENSG00000224271 44.15356
## ENSG00000221867 36.06562
## ENSG00000197172 35.33394
## ENSG00000279712 34.58861
## ENSG00000132274 33.06661
## ENSG00000154175 31.89268
## ENSG00000172638 31.14717
## ENSG00000268606 31.11845
## ENSG00000282944 29.96770
## ENSG00000205634 29.81070
# Genome-wide plots
plotMD(fit.cont, coef = "HvsL", status = summa.fit[, "HvsL"], values = c(-1, 1), hl.col = c("blue", "red"), main = "HvsL")
# Index the contrast by name: column 1 is MvsL, not HvsL
glXYPlot(x = fit.cont$coefficients[, "HvsL"], y = fit.cont$lods[, "HvsL"],
         xlab = "logFC", ylab = "B", main = "HvsL",
         counts = v$E, groups = Mutation_count, status = summa.fit[, "HvsL"],
         anno = fit.cont$genes, side.main = "ENSEMBL", folder = "mut_counts_volcano", html = "mut_counts_volcano")
# Testing relative to a threshold - TREAT
fit.treat <- treat(fit.cont,lfc=log2(1.2))
res.treat <- decideTests(fit.treat)
summary(res.treat)
## MvsL HvsM HvsL
## Down 1 20 875
## NotSig 22203 22175 20411
## Up 8 17 926
result_treat <- topTreat(fit.treat, coef="HvsL", number = Inf)
# Treat plots
plotMD(fit.treat, coef = "HvsL", status = res.treat[, "HvsL"], values = c(-1, 1), hl.col = c("blue", "red"), main = "HvsL (TREAT)")
glMDPlot(fit.treat, coef = "HvsL", counts = v$E, groups = Mutation_count,
         status = res.treat, side.main = "ENSEMBL", main = "HvsL",
         folder = "mut_counts_md", html = "mut_counts_md")
## Warning in makeAnno(x, anno): No gene annotation provided.
# Extract ENSEMBL IDs from DE gene rownames
de_genes_raw <- rownames(result_treat)[result_treat$adj.P.Val < 0.05]
de_genes <- stringr::str_extract(de_genes_raw, "ENSG[0-9]+")
de_genes <- na.omit(de_genes)
# Run gprofiler2 enrichment using ENSEMBL IDs
go_res <- gost(query = de_genes, organism = "hsapiens", sources = c("GO:BP", "GO:MF", "GO:CC"))
# Show top results
head(go_res$result, 10)
## query significant p_value term_size query_size intersection_size
## 1 query_1 TRUE 9.837452e-30 4053 1083 371
## 2 query_1 TRUE 4.349511e-29 1534 1083 193
## 3 query_1 TRUE 5.166588e-29 5997 1083 487
## 4 query_1 TRUE 2.427372e-28 7322 1083 560
## 5 query_1 TRUE 2.844991e-28 4727 1083 409
## 6 query_1 TRUE 3.384786e-28 2713 1083 278
## 7 query_1 TRUE 6.338640e-28 2956 1083 294
## 8 query_1 TRUE 1.002168e-27 6553 1083 515
## 9 query_1 TRUE 9.683175e-27 1793 1083 208
## 10 query_1 TRUE 1.670193e-25 1145 1083 154
## precision recall term_id source
## 1 0.3425669 0.09153713 GO:0048731 GO:BP
## 2 0.1782087 0.12581486 GO:0016477 GO:BP
## 3 0.4496768 0.08120727 GO:0048856 GO:BP
## 4 0.5170822 0.07648184 GO:0032501 GO:BP
## 5 0.3776547 0.08652422 GO:0007275 GO:BP
## 6 0.2566944 0.10246959 GO:0009653 GO:BP
## 7 0.2714681 0.09945873 GO:0051239 GO:BP
## 8 0.4755309 0.07858996 GO:0032502 GO:BP
## 9 0.1920591 0.11600669 GO:0048870 GO:BP
## 10 0.1421976 0.13449782 GO:0072359 GO:BP
## term_name effective_domain_size
## 1 system development 21026
## 2 cell migration 21026
## 3 anatomical structure development 21026
## 4 multicellular organismal process 21026
## 5 multicellular organism development 21026
## 6 anatomical structure morphogenesis 21026
## 7 regulation of multicellular organismal process 21026
## 8 developmental process 21026
## 9 cell motility 21026
## 10 circulatory system development 21026
## source_order parents
## 1 12821 GO:00072....
## 2 5204 GO:0048870
## 3 12931 GO:0032502
## 4 7557 GO:0008150
## 5 2788 GO:00325....
## 6 3557 GO:00325....
## 7 13407 GO:00325....
## 8 7558 GO:0008150
## 9 12944 GO:0009987
## 10 17136 GO:0048731
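Three contrasts were defined: MvsL (Medium vs Low), HvsM (High vs Medium) and HvsL (High vs Low). After filtering with filterByExpr, 22,212 genes remained. With TREAT (fold-change threshold of 1.2), the numbers of differentially expressed genes were: MvsL, 1 underexpressed and 8 overexpressed; HvsM, 20 underexpressed and 17 overexpressed; HvsL, 875 underexpressed and 926 overexpressed. The top GO terms for the HvsL genes are dominated by developmental and cell migration/motility processes (system development, cell migration, cell motility, circulatory system development). These results may reflect underexpression of ECM, angiogenesis and cell adhesion related genes in high versus low mutation loads, but the enrichment was run on up- and down-regulated genes combined, so the direction is not established by this output alone.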
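One way to check the direction of the GO signal for HvsL is to split the significant genes by the sign of logFC and run the enrichment on each set separately. The following is a minimal sketch under stated assumptions, not part of the original analysis: `toy_treat` is a hypothetical stand-in for `result_treat` with made-up values, and `split_by_direction` is a helper introduced here for illustration.

```r
# Toy stand-in for `result_treat` (hypothetical values, NOT results from this analysis)
toy_treat <- data.frame(
  logFC     = c(1.8, -2.1, 0.3, -1.2, 2.5),
  adj.P.Val = c(0.001, 0.02, 0.40, 0.04, 0.20),
  row.names = c("ENSG00000000001", "ENSG00000000002", "ENSG00000000003",
                "ENSG00000000004", "ENSG00000000005")
)

# Hypothetical helper: keep genes below the adjusted p-value cutoff,
# then separate them by the sign of the log fold change
split_by_direction <- function(tab, alpha = 0.05) {
  sig <- tab[tab$adj.P.Val < alpha, , drop = FALSE]
  list(
    up   = rownames(sig)[sig$logFC > 0],
    down = rownames(sig)[sig$logFC < 0]
  )
}

de_sets <- split_by_direction(toy_treat)
print(de_sets)

# With the real table, the named list could be passed to gprofiler2 as two
# queries (not run here, requires network access):
# go_dir <- gost(query = split_by_direction(result_treat),
#                organism = "hsapiens",
#                sources = c("GO:BP", "GO:MF", "GO:CC"))
```

With a named list as `query`, the `query` column of the enrichment result should distinguish terms enriched among the up-regulated genes from those enriched among the down-regulated genes, which would support or contradict the directional interpretation.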
# Load into edgeR
y <- DGEList(counts = exp_matrix)
# Design for combined groups
Cancer_type <- meta_intersect$type
Mutation_count <- meta_intersect$mut_counts
Group <- factor(paste(Cancer_type,Mutation_count,sep="."))
design = model.matrix(~0+Group, data = y$samples)
colnames(design) <- levels(Group)
design
## Metastasis.High Metastasis.Low Metastasis.Medium Primary.High
## ACH-000001 0 0 1 0
## ACH-000002 0 0 0 0
## ACH-000006 0 0 0 0
## ACH-000007 0 0 0 0
## ACH-000009 0 0 0 0
## ACH-000011 0 0 1 0
## ACH-000012 0 0 0 0
## ACH-000013 0 1 0 0
## ACH-000014 1 0 0 0
## ACH-000015 0 0 0 0
## ACH-000016 0 1 0 0
## ACH-000017 0 0 1 0
## ACH-000018 0 0 0 0
## ACH-000019 1 0 0 0
## ACH-000021 0 0 1 0
## ACH-000022 0 1 0 0
## ACH-000023 0 0 1 0
## ACH-000024 0 0 0 1
## ACH-000026 0 0 1 0
## ACH-000027 0 0 0 0
## ACH-000028 1 0 0 0
## ACH-000029 0 0 0 0
## ACH-000030 0 0 0 1
## ACH-000031 0 0 0 0
## ACH-000033 0 1 0 0
## ACH-000035 0 1 0 0
## ACH-000039 0 1 0 0
## ACH-000040 0 0 0 0
## ACH-000041 0 0 0 0
## ACH-000042 0 0 0 0
## ACH-000043 0 1 0 0
## ACH-000044 0 1 0 0
## ACH-000045 0 0 0 0
## ACH-000046 0 1 0 0
## ACH-000048 0 0 0 0
## ACH-000050 0 0 1 0
## ACH-000051 0 0 0 0
## ACH-000052 0 0 0 0
## ACH-000054 0 1 0 0
## ACH-000055 0 1 0 0
## ACH-000056 0 1 0 0
## ACH-000059 0 0 0 0
## ACH-000060 0 0 0 0
## ACH-000062 0 0 0 0
## ACH-000065 0 0 0 0
## ACH-000066 0 1 0 0
## ACH-000067 0 0 0 0
## ACH-000070 0 0 0 0
## ACH-000073 0 0 0 0
## ACH-000075 0 0 0 0
## ACH-000078 0 1 0 0
## ACH-000082 0 0 0 0
## ACH-000085 0 0 0 0
## ACH-000087 0 0 0 0
## ACH-000089 0 1 0 0
## ACH-000090 1 0 0 0
## ACH-000091 0 1 0 0
## ACH-000092 0 1 0 0
## ACH-000093 0 0 0 0
## ACH-000094 0 1 0 0
## ACH-000096 0 0 0 0
## ACH-000097 0 0 1 0
## ACH-000098 0 0 0 0
## ACH-000100 0 1 0 0
## ACH-000102 0 0 0 0
## ACH-000103 0 1 0 0
## ACH-000105 0 0 0 0
## ACH-000107 0 0 0 0
## ACH-000108 0 0 1 0
## ACH-000109 0 0 0 0
## ACH-000111 0 0 0 0
## ACH-000113 0 0 0 0
## ACH-000114 0 1 0 0
## ACH-000115 0 1 0 0
## ACH-000116 0 1 0 0
## ACH-000117 0 1 0 0
## ACH-000118 0 0 1 0
## ACH-000121 0 1 0 0
## ACH-000123 0 0 0 0
## ACH-000124 0 0 0 0
## ACH-000125 0 1 0 0
## ACH-000126 0 0 0 0
## ACH-000127 0 0 0 0
## ACH-000129 0 1 0 0
## ACH-000131 0 0 0 0
## ACH-000132 0 0 0 0
## ACH-000133 0 0 0 0
## ACH-000135 0 0 0 0
## ACH-000136 0 0 0 0
## ACH-000137 0 0 0 0
## ACH-000138 0 1 0 0
## ACH-000139 0 0 0 0
## ACH-000140 0 1 0 0
## ACH-000142 0 0 0 0
## ACH-000144 0 0 1 0
## ACH-000145 0 0 0 0
## ACH-000146 0 0 0 0
## ACH-000147 0 1 0 0
## ACH-000148 0 0 0 0
## ACH-000149 0 1 0 0
## ACH-000150 0 0 0 0
## ACH-000152 0 0 0 0
## ACH-000153 0 0 1 0
## ACH-000155 0 0 0 0
## ACH-000157 0 0 0 1
## ACH-000158 0 0 0 0
## ACH-000159 0 0 0 0
## ACH-000161 0 1 0 0
## ACH-000163 0 0 0 0
## ACH-000164 0 0 0 0
## ACH-000166 0 0 0 0
## ACH-000167 0 0 1 0
## ACH-000168 0 0 0 0
## ACH-000169 0 0 0 0
## ACH-000171 0 0 0 0
## ACH-000172 0 1 0 0
## ACH-000174 0 0 0 0
## ACH-000176 0 0 0 0
## ACH-000177 0 0 1 0
## ACH-000178 0 1 0 0
## ACH-000179 0 0 1 0
## ACH-000181 0 0 0 0
## ACH-000183 0 0 0 0
## ACH-000186 0 0 0 0
## ACH-000187 0 0 0 0
## ACH-000188 0 0 0 0
## ACH-000189 0 0 0 0
## ACH-000191 0 1 0 0
## ACH-000192 0 0 0 0
## ACH-000193 0 0 0 0
## ACH-000194 0 0 0 0
## ACH-000196 0 0 0 1
## ACH-000197 0 0 0 0
## ACH-000200 0 0 0 0
## ACH-000201 0 0 0 0
## ACH-000202 0 0 0 0
## ACH-000203 0 0 1 0
## ACH-000204 0 0 0 0
## ACH-000205 0 0 1 0
## ACH-000207 0 0 1 0
## ACH-000210 0 0 1 0
## ACH-000211 0 0 0 0
## ACH-000212 0 1 0 0
## ACH-000213 0 0 1 0
## ACH-000217 0 0 0 0
## ACH-000219 0 0 0 1
## ACH-000221 0 0 0 0
## ACH-000222 0 0 1 0
## ACH-000223 0 0 0 0
## ACH-000227 0 0 1 0
## ACH-000228 0 0 0 0
## ACH-000229 0 0 0 0
## ACH-000231 0 0 0 0
## ACH-000232 0 0 0 0
## ACH-000234 0 0 0 0
## ACH-000235 0 0 0 0
## ACH-000236 0 0 0 0
## ACH-000237 0 0 0 0
## ACH-000238 0 0 0 0
## ACH-000239 0 0 1 0
## ACH-000240 0 0 0 0
## ACH-000242 0 0 0 0
## ACH-000244 0 0 0 0
## ACH-000246 0 0 1 0
## ACH-000247 0 0 1 0
## ACH-000248 1 0 0 0
## ACH-000249 0 0 0 0
## ACH-000250 0 0 0 0
## ACH-000252 0 0 0 0
## ACH-000253 0 0 1 0
## ACH-000255 0 0 1 0
## ACH-000256 0 1 0 0
## ACH-000257 0 0 1 0
## ACH-000258 0 0 1 0
## ACH-000259 0 0 0 0
## ACH-000260 0 1 0 0
## ACH-000261 0 0 1 0
## ACH-000263 0 0 0 0
## ACH-000264 0 0 1 0
## ACH-000265 0 0 1 0
## ACH-000269 0 0 0 0
## ACH-000270 0 0 0 0
## ACH-000271 0 1 0 0
## ACH-000272 0 0 0 0
## ACH-000273 0 0 0 0
## ACH-000275 0 1 0 0
## ACH-000276 0 0 0 0
## ACH-000277 0 0 0 0
## ACH-000278 0 0 1 0
## ACH-000280 0 0 0 0
## ACH-000281 0 0 0 0
## ACH-000282 0 0 1 0
## ACH-000285 0 0 0 0
## ACH-000286 0 0 0 0
## ACH-000288 0 0 0 0
## ACH-000290 0 0 1 0
## ACH-000291 0 1 0 0
## ACH-000292 0 0 1 0
## ACH-000293 0 0 0 0
## ACH-000294 0 0 0 0
## ACH-000296 1 0 0 0
## ACH-000297 0 0 1 0
## ACH-000298 0 0 1 0
## ACH-000300 0 1 0 0
## ACH-000301 0 0 0 0
## ACH-000302 0 0 0 0
## ACH-000303 0 0 1 0
## ACH-000304 1 0 0 0
## ACH-000306 0 0 0 0
## ACH-000307 0 0 1 0
## ACH-000308 0 0 1 0
## ACH-000310 0 1 0 0
## ACH-000311 0 0 1 0
## ACH-000312 0 0 1 0
## ACH-000313 0 0 0 0
## ACH-000314 0 0 0 1
## ACH-000315 0 0 1 0
## ACH-000317 0 0 0 0
## ACH-000318 0 0 0 0
## ACH-000320 0 0 0 0
## ACH-000322 0 0 1 0
## ACH-000323 0 0 0 0
## ACH-000324 0 0 0 0
## ACH-000325 0 0 1 0
## ACH-000327 0 0 0 0
## ACH-000329 0 0 0 0
## ACH-000330 0 0 1 0
## ACH-000332 0 0 1 0
## ACH-000333 0 0 0 0
## ACH-000334 0 0 1 0
## ACH-000335 0 1 0 0
## ACH-000336 0 0 0 0
## ACH-000338 0 0 0 0
## ACH-000341 0 0 1 0
## ACH-000343 0 0 0 0
## ACH-000344 0 0 1 0
## ACH-000345 0 0 1 0
## ACH-000347 0 0 0 1
## ACH-000348 0 0 1 0
## ACH-000349 0 0 0 0
## ACH-000350 1 0 0 0
## ACH-000351 0 0 1 0
## ACH-000352 0 0 1 0
## ACH-000353 0 0 0 1
## ACH-000354 0 0 1 0
## ACH-000355 0 0 1 0
## ACH-000356 0 0 1 0
## ACH-000358 0 0 1 0
## ACH-000359 0 0 0 0
## ACH-000360 0 0 1 0
## ACH-000361 0 1 0 0
## ACH-000362 0 0 0 0
## ACH-000363 0 0 0 0
## ACH-000364 0 0 0 0
## ACH-000365 0 0 1 0
## ACH-000366 0 0 1 0
## ACH-000367 0 1 0 0
## ACH-000368 0 0 0 0
## ACH-000373 0 0 0 0
## ACH-000374 0 0 0 0
## ACH-000375 0 0 0 0
## ACH-000376 0 0 0 0
## ACH-000378 0 0 1 0
## ACH-000379 0 0 1 0
## ACH-000380 0 0 0 0
## ACH-000381 1 0 0 0
## ACH-000382 0 0 1 0
## ACH-000383 0 0 0 0
## ACH-000384 0 0 0 0
## ACH-000386 0 0 0 0
## ACH-000388 0 0 1 0
## ACH-000389 0 0 0 0
## ACH-000390 0 0 0 0
## ACH-000391 0 0 1 0
## ACH-000392 0 0 1 0
## ACH-000393 0 0 0 0
## ACH-000394 1 0 0 0
## ACH-000395 0 0 0 1
## ACH-000396 0 0 0 0
## ACH-000397 0 0 0 0
## ACH-000399 0 0 1 0
## ACH-000400 0 0 1 0
## ACH-000401 0 0 1 0
## ACH-000403 0 1 0 0
## ACH-000404 0 0 0 0
## ACH-000406 0 0 1 0
## ACH-000407 0 0 0 0
## ACH-000409 0 0 1 0
## ACH-000410 0 0 0 0
## ACH-000411 0 0 0 0
## ACH-000412 0 0 0 0
## ACH-000414 1 0 0 0
## ACH-000415 0 0 0 0
## ACH-000416 1 0 0 0
## ACH-000417 0 0 0 0
## ACH-000418 0 0 0 0
## ACH-000419 0 0 0 1
## ACH-000420 0 0 0 0
## ACH-000421 0 0 0 0
## ACH-000422 0 0 0 0
## ACH-000423 0 0 1 0
## ACH-000424 0 0 0 0
## ACH-000427 0 0 1 0
## ACH-000429 0 0 0 1
## ACH-000430 0 0 0 0
## ACH-000431 1 0 0 0
## ACH-000433 0 1 0 0
## ACH-000434 1 0 0 0
## ACH-000435 0 1 0 0
## ACH-000437 0 0 0 0
## ACH-000438 0 0 0 0
## ACH-000441 0 0 1 0
## ACH-000442 0 0 0 0
## ACH-000443 0 0 0 0
## ACH-000444 1 0 0 0
## ACH-000445 0 0 0 0
## ACH-000446 0 0 1 0
## ACH-000447 0 0 0 0
## ACH-000448 1 0 0 0
## ACH-000449 0 0 0 0
## ACH-000450 0 0 0 0
## ACH-000451 0 0 0 0
## ACH-000453 0 0 0 0
## ACH-000454 0 0 1 0
## ACH-000456 0 0 0 0
## ACH-000457 0 0 1 0
## ACH-000458 0 0 1 0
## ACH-000459 0 0 0 0
## ACH-000460 0 0 1 0
## ACH-000463 1 0 0 0
## ACH-000464 0 0 0 0
## ACH-000465 0 0 1 0
## ACH-000466 0 0 1 0
## ACH-000468 0 0 0 0
## ACH-000469 0 0 0 0
## ACH-000470 0 0 0 1
## ACH-000472 1 0 0 0
## ACH-000473 0 0 0 1
## ACH-000475 0 0 0 1
## ACH-000476 0 0 0 0
## ACH-000477 0 0 1 0
## ACH-000478 0 0 0 0
## ACH-000479 0 0 0 0
## ACH-000480 0 0 0 0
## ACH-000481 0 0 0 1
## ACH-000482 0 0 0 0
## ACH-000483 0 0 0 0
## ACH-000484 0 0 0 0
## ACH-000485 0 0 1 0
## ACH-000487 0 0 1 0
## ACH-000488 0 0 0 1
## ACH-000489 0 0 0 0
## ACH-000490 0 0 1 0
## ACH-000491 1 0 0 0
## ACH-000493 0 0 0 0
## ACH-000495 0 0 0 0
## ACH-000496 0 0 1 0
## ACH-000501 0 0 0 1
## ACH-000502 0 1 0 0
## ACH-000503 0 0 0 0
## ACH-000504 0 0 0 0
## ACH-000505 0 0 0 0
## ACH-000506 0 0 1 0
## ACH-000507 1 0 0 0
## ACH-000508 1 0 0 0
## ACH-000510 1 0 0 0
## ACH-000511 1 0 0 0
## ACH-000514 1 0 0 0
## ACH-000515 1 0 0 0
## ACH-000517 0 0 1 0
## ACH-000518 0 0 0 0
## ACH-000520 0 0 1 0
## ACH-000521 1 0 0 0
## ACH-000522 0 0 0 0
## ACH-000523 1 0 0 0
## ACH-000524 0 0 1 0
## ACH-000525 0 0 1 0
## ACH-000526 0 0 0 0
## ACH-000527 1 0 0 0
## ACH-000528 1 0 0 0
## ACH-000530 0 0 0 0
## ACH-000532 0 0 1 0
## ACH-000534 0 0 1 0
## ACH-000535 0 0 0 0
## ACH-000536 0 0 0 1
## ACH-000538 0 0 0 0
## ACH-000539 0 0 0 0
## ACH-000541 0 0 0 1
## ACH-000542 0 0 1 0
## ACH-000544 0 0 0 0
## ACH-000545 0 0 0 1
## ACH-000546 1 0 0 0
## ACH-000547 0 0 0 0
## ACH-000548 0 0 0 1
## ACH-000550 0 0 0 1
## ACH-000551 1 0 0 0
## ACH-000552 0 0 0 1
## ACH-000553 0 0 0 1
## ACH-000554 0 0 0 1
## ACH-000555 0 0 0 0
## ACH-000557 0 0 0 0
## ACH-000558 0 0 0 0
## ACH-000559 0 0 0 1
## ACH-000561 1 0 0 0
## ACH-000562 0 0 1 0
## ACH-000563 1 0 0 0
## ACH-000564 0 0 0 1
## ACH-000565 0 0 0 1
## ACH-000566 0 0 0 0
## ACH-000568 0 0 0 0
## ACH-000569 0 0 0 0
## ACH-000570 0 0 0 1
## ACH-000571 0 0 0 0
## ACH-000572 0 0 0 0
## ACH-000573 0 0 1 0
## ACH-000574 0 0 0 1
## ACH-000576 0 0 0 1
## ACH-000577 0 0 1 0
## ACH-000578 0 0 0 0
## ACH-000579 0 0 0 1
## ACH-000580 0 0 0 0
## ACH-000581 0 0 1 0
## ACH-000582 1 0 0 0
## ACH-000584 0 0 0 1
## ACH-000585 0 0 0 1
## ACH-000586 1 0 0 0
## ACH-000587 0 0 0 0
## ACH-000588 0 0 0 1
## ACH-000589 1 0 0 0
## ACH-000590 0 0 0 1
## ACH-000593 0 0 0 0
## ACH-000594 1 0 0 0
## ACH-000595 0 0 0 1
## ACH-000596 0 0 0 1
## ACH-000598 0 0 0 1
## ACH-000599 0 0 0 0
## ACH-000600 0 0 0 0
## ACH-000601 0 0 0 0
## ACH-000603 1 0 0 0
## ACH-000606 0 0 0 0
## ACH-000607 0 0 0 1
## ACH-000608 0 0 0 0
## ACH-000609 0 0 0 0
## ACH-000610 0 0 0 1
## ACH-000611 1 0 0 0
## ACH-000613 0 0 0 0
## ACH-000614 1 0 0 0
## ACH-000616 1 0 0 0
## ACH-000617 0 0 1 0
## ACH-000619 0 0 0 0
## ACH-000620 0 0 0 1
## ACH-000621 1 0 0 0
## ACH-000622 0 0 0 1
## ACH-000623 0 0 0 0
## ACH-000624 0 0 0 0
## ACH-000625 0 0 0 0
## ACH-000627 1 0 0 0
## ACH-000628 0 0 0 1
## ACH-000631 0 0 0 1
## ACH-000632 1 0 0 0
## ACH-000633 0 0 0 1
## ACH-000635 1 0 0 0
## ACH-000637 0 0 0 1
## ACH-000638 1 0 0 0
## ACH-000639 1 0 0 0
## ACH-000640 0 0 0 1
## ACH-000643 0 0 0 1
## ACH-000644 0 0 0 0
## ACH-000645 0 0 0 0
## ACH-000646 0 0 0 1
## ACH-000647 0 0 0 1
## ACH-000648 0 0 1 0
## ACH-000649 0 0 0 1
## ACH-000650 1 0 0 0
## ACH-000651 1 0 0 0
## ACH-000652 1 0 0 0
## ACH-000653 0 0 0 0
## ACH-000655 0 0 0 0
## ACH-000656 1 0 0 0
## ACH-000657 0 0 0 1
## ACH-000658 0 0 0 1
## ACH-000659 1 0 0 0
## ACH-000660 0 0 0 0
## ACH-000662 1 0 0 0
## ACH-000663 1 0 0 0
## ACH-000665 1 0 0 0
## ACH-000666 1 0 0 0
## ACH-000667 0 0 0 1
## ACH-000668 0 0 0 0
## ACH-000669 0 0 0 1
## ACH-000670 1 0 0 0
## ACH-000672 1 0 0 0
## ACH-000674 1 0 0 0
## ACH-000675 0 0 0 1
## ACH-000677 0 0 0 0
## ACH-000678 1 0 0 0
## ACH-000679 0 0 0 1
## ACH-000680 0 0 0 1
## ACH-000681 0 0 0 1
## ACH-000683 1 0 0 0
## ACH-000684 0 0 0 0
## ACH-000685 0 0 1 0
## ACH-000686 0 0 0 0
## ACH-000688 0 0 0 0
## ACH-000691 0 0 0 1
## ACH-000692 0 0 0 0
## ACH-000693 0 0 0 1
## ACH-000694 1 0 0 0
## ACH-000695 1 0 0 0
## ACH-000696 0 0 0 1
## ACH-000697 0 0 0 1
## ACH-000698 0 0 0 1
## ACH-000699 0 0 0 1
## ACH-000701 0 0 0 0
## ACH-000703 1 0 0 0
## ACH-000704 0 0 1 0
## ACH-000705 0 0 0 1
## ACH-000706 0 0 0 1
## ACH-000708 0 0 1 0
## ACH-000709 0 0 0 0
## ACH-000710 0 0 0 0
## ACH-000711 0 1 0 0
## ACH-000712 0 0 0 0
## ACH-000713 0 0 0 0
## ACH-000714 0 0 0 1
## ACH-000718 1 0 0 0
## ACH-000719 0 0 1 0
## ACH-000720 0 0 0 0
## ACH-000721 1 0 0 0
## ACH-000722 1 0 0 0
## ACH-000724 0 0 0 0
## ACH-000725 0 0 0 1
## ACH-000729 0 0 0 1
## ACH-000730 1 0 0 0
## ACH-000731 0 0 0 0
## ACH-000732 0 0 0 0
## ACH-000733 0 0 0 1
## ACH-000734 0 0 0 1
## ACH-000735 0 0 0 0
## ACH-000736 1 0 0 0
## ACH-000737 0 0 1 0
## ACH-000738 0 0 0 1
## ACH-000739 0 0 0 0
## ACH-000740 0 0 0 1
## ACH-000741 0 0 0 0
## ACH-000743 0 0 0 1
## ACH-000744 1 0 0 0
## ACH-000745 0 0 0 1
## ACH-000746 1 0 0 0
## ACH-000747 0 0 0 1
## ACH-000748 0 0 0 0
## ACH-000749 1 0 0 0
## ACH-000750 1 0 0 0
## ACH-000752 1 0 0 0
## ACH-000753 1 0 0 0
## ACH-000755 0 0 0 0
## ACH-000756 0 0 0 1
## ACH-000758 1 0 0 0
## ACH-000759 1 0 0 0
## ACH-000761 0 0 1 0
## ACH-000763 0 0 0 1
## ACH-000764 0 0 0 1
## ACH-000765 0 0 1 0
## ACH-000766 1 0 0 0
## ACH-000767 1 0 0 0
## ACH-000768 1 0 0 0
## ACH-000769 0 0 0 1
## ACH-000771 0 0 0 0
## ACH-000774 0 0 0 1
## ACH-000775 0 0 0 1
## ACH-000776 0 0 0 1
## ACH-000777 0 0 0 1
## ACH-000778 1 0 0 0
## ACH-000780 1 0 0 0
## ACH-000781 1 0 0 0
## ACH-000782 0 0 0 1
## ACH-000783 1 0 0 0
## ACH-000784 0 0 0 1
## ACH-000785 1 0 0 0
## ACH-000787 0 0 0 1
## ACH-000788 1 0 0 0
## ACH-000789 0 0 0 1
## ACH-000790 0 0 0 1
## ACH-000791 0 0 0 1
## ACH-000792 0 0 0 0
## ACH-000793 1 0 0 0
## ACH-000794 0 0 1 0
## ACH-000796 0 0 0 1
## ACH-000797 0 0 0 0
## ACH-000798 0 0 0 0
## ACH-000799 1 0 0 0
## ACH-000800 1 0 0 0
## ACH-000802 0 0 0 1
## ACH-000803 1 0 0 0
## ACH-000804 1 0 0 0
## ACH-000805 1 0 0 0
## ACH-000808 0 0 0 1
## ACH-000809 0 0 0 1
## ACH-000810 1 0 0 0
## ACH-000811 1 0 0 0
## ACH-000812 1 0 0 0
## ACH-000813 1 0 0 0
## ACH-000815 1 0 0 0
## ACH-000816 1 0 0 0
## ACH-000817 0 0 0 1
## ACH-000818 0 0 0 1
## ACH-000819 0 0 0 1
## ACH-000820 0 0 0 1
## ACH-000821 0 0 0 1
## ACH-000822 1 0 0 0
## ACH-000823 0 0 0 1
## ACH-000824 0 0 0 1
## ACH-000825 0 0 0 1
## ACH-000826 0 0 0 1
## ACH-000828 1 0 0 0
## ACH-000830 1 0 0 0
## ACH-000831 1 0 0 0
## ACH-000832 0 0 0 1
## ACH-000833 1 0 0 0
## ACH-000834 0 0 1 0
## ACH-000835 1 0 0 0
## ACH-000837 0 0 0 1
## ACH-000838 0 0 0 1
## ACH-000839 0 0 0 1
## ACH-000840 0 0 0 1
## ACH-000841 1 0 0 0
## ACH-000842 0 0 0 1
## ACH-000843 1 0 0 0
## ACH-000844 0 0 0 1
## ACH-000845 0 0 0 1
## ACH-000846 0 0 0 1
## ACH-000847 1 0 0 0
## ACH-000848 0 0 0 1
## ACH-000849 1 0 0 0
## ACH-000850 0 1 0 0
## ACH-000851 0 0 0 1
## ACH-000852 0 0 0 1
## ACH-000853 1 0 0 0
## ACH-000855 0 0 0 1
## ACH-000856 1 0 0 0
## ACH-000857 0 0 0 0
## ACH-000858 1 0 0 0
## ACH-000859 0 0 0 1
## ACH-000860 0 0 0 1
## ACH-000861 0 0 0 1
## ACH-000862 0 0 0 1
## ACH-000863 0 0 0 1
## ACH-000864 0 0 0 1
## ACH-000865 0 0 0 1
## ACH-000866 1 0 0 0
## ACH-000867 1 0 0 0
## ACH-000868 0 0 0 1
## ACH-000869 1 0 0 0
## ACH-000870 1 0 0 0
## ACH-000871 1 0 0 0
## ACH-000873 0 0 0 1
## ACH-000874 0 0 0 1
## ACH-000875 0 0 0 1
## ACH-000876 1 0 0 0
## ACH-000877 0 0 0 1
## ACH-000878 0 0 0 1
## ACH-000879 0 0 0 1
## ACH-000880 0 0 0 1
## ACH-000881 0 0 0 1
## ACH-000882 1 0 0 0
## ACH-000883 0 0 0 1
## ACH-000884 1 0 0 0
## ACH-000885 0 0 0 1
## ACH-000886 1 0 0 0
## ACH-000888 0 0 0 1
## ACH-000889 0 0 0 1
## ACH-000890 0 0 0 1
## ACH-000891 0 0 0 1
## ACH-000892 0 0 0 1
## ACH-000893 0 0 0 1
## ACH-000894 1 0 0 0
## ACH-000895 0 0 0 1
## ACH-000896 0 0 0 1
## ACH-000897 1 0 0 0
## ACH-000898 0 0 0 1
## ACH-000899 1 0 0 0
## ACH-000900 0 0 0 1
## ACH-000901 0 0 0 1
## ACH-000902 1 0 0 0
## ACH-000903 1 0 0 0
## ACH-000904 1 0 0 0
## ACH-000906 0 0 0 1
## ACH-000907 0 0 0 1
## ACH-000908 0 0 0 1
## ACH-000909 0 0 0 1
## ACH-000910 1 0 0 0
## ACH-000911 1 0 0 0
## ACH-000912 0 0 0 1
## ACH-000913 0 0 0 1
## ACH-000915 0 0 0 1
## ACH-000916 1 0 0 0
## ACH-000919 0 0 0 1
## ACH-000921 1 0 0 0
## ACH-000924 0 0 0 1
## ACH-000925 1 0 0 0
## ACH-000926 0 0 0 1
## ACH-000927 0 0 0 1
## ACH-000928 0 0 0 1
## ACH-000929 1 0 0 0
## ACH-000930 0 0 0 1
## ACH-000931 0 0 0 1
## ACH-000932 1 0 0 0
## ACH-000934 1 0 0 0
## ACH-000936 1 0 0 0
## ACH-000938 0 0 0 1
## ACH-000939 0 0 0 1
## ACH-000940 1 0 0 0
## ACH-000941 0 0 0 1
## ACH-000942 0 0 0 1
## ACH-000943 0 0 0 1
## ACH-000945 1 0 0 0
## ACH-000946 0 0 0 1
## ACH-000947 1 0 0 0
## ACH-000948 0 0 0 1
## ACH-000949 1 0 0 0
## ACH-000950 1 0 0 0
## ACH-000951 0 0 0 1
## ACH-000952 1 0 0 0
## ACH-000953 1 0 0 0
## ACH-000954 0 0 0 1
## ACH-000955 1 0 0 0
## ACH-000956 0 0 0 1
## ACH-000957 0 0 0 1
## ACH-000958 0 0 0 1
## ACH-000960 0 0 0 1
## ACH-000961 0 0 0 1
## ACH-000962 1 0 0 0
## ACH-000963 1 0 0 0
## ACH-000965 0 0 0 1
## ACH-000966 0 0 0 1
## ACH-000967 0 0 0 1
## ACH-000968 1 0 0 0
## ACH-000969 0 0 0 1
## ACH-000971 0 0 0 1
## ACH-000972 0 0 0 1
## ACH-000973 0 0 0 1
## ACH-000974 1 0 0 0
## ACH-000976 1 0 0 0
## ACH-000977 1 0 0 0
## ACH-000978 0 0 0 1
## ACH-000980 1 0 0 0
## ACH-000981 0 0 0 1
## ACH-000982 0 0 0 1
## ACH-000984 0 0 0 1
## ACH-000985 0 0 0 1
## ACH-000987 1 0 0 0
## ACH-000989 1 0 0 0
## ACH-000990 0 0 0 1
## ACH-000993 0 0 0 1
## ACH-000994 0 0 0 1
## ACH-000995 0 0 0 1
## ACH-000996 0 0 0 1
## ACH-000997 0 0 0 1
## ACH-000998 0 0 0 1
## ACH-001001 0 0 0 0
## ACH-001041 0 1 0 0
## ACH-001048 0 1 0 0
## ACH-001061 0 0 0 1
## ACH-001075 0 0 1 0
## ACH-001078 0 0 0 0
## ACH-001106 0 0 0 0
## ACH-001113 0 0 0 1
## ACH-001129 0 0 0 0
## ACH-001145 1 0 0 0
## ACH-001151 0 1 0 0
## ACH-001163 0 1 0 0
## ACH-001184 1 0 0 0
## ACH-001190 1 0 0 0
## ACH-001192 0 1 0 0
## ACH-001194 0 1 0 0
## ACH-001200 0 0 1 0
## ACH-001210 0 1 0 0
## ACH-001229 0 0 0 0
## ACH-001239 1 0 0 0
## ACH-001277 0 0 1 0
## ACH-001278 0 1 0 0
## ACH-001283 0 1 0 0
## ACH-001306 0 0 0 0
## ACH-001307 0 0 0 0
## ACH-001318 0 0 0 1
## ACH-001321 0 0 0 0
## ACH-001328 0 0 0 1
## ACH-001329 0 0 0 0
## ACH-001332 0 0 0 0
## ACH-001333 0 0 0 1
## ACH-001334 0 0 0 0
## ACH-001335 0 0 0 0
## ACH-001336 1 0 0 0
## ACH-001339 0 0 0 0
## ACH-001340 0 1 0 0
## ACH-001341 0 0 0 0
## ACH-001344 1 0 0 0
## ACH-001345 0 0 0 1
## ACH-001346 0 0 0 0
## ACH-001347 0 0 0 0
## ACH-001353 0 0 0 0
## ACH-001354 0 0 0 0
## ACH-001356 0 0 0 0
## ACH-001360 0 1 0 0
## ACH-001366 1 0 0 0
## ACH-001367 1 0 0 0
## ACH-001368 0 1 0 0
## ACH-001369 0 0 0 1
## ACH-001370 0 0 0 0
## ACH-001373 0 1 0 0
## ACH-001374 0 0 1 0
## ACH-001375 0 0 0 0
## ACH-001376 0 1 0 0
## ACH-001377 0 0 0 1
## ACH-001378 0 0 0 0
## ACH-001379 0 1 0 0
## ACH-001380 0 1 0 0
## ACH-001382 0 1 0 0
## ACH-001384 0 1 0 0
## ACH-001385 0 1 0 0
## ACH-001386 1 0 0 0
## ACH-001388 0 0 0 0
## ACH-001389 0 0 1 0
## ACH-001390 0 0 0 1
## ACH-001391 0 0 0 0
## ACH-001392 0 1 0 0
## ACH-001394 0 1 0 0
## ACH-001395 0 0 1 0
## ACH-001396 0 1 0 0
## ACH-001398 0 0 0 1
## ACH-001399 0 0 1 0
## ACH-001400 0 0 0 0
## ACH-001401 0 0 0 1
## ACH-001402 0 0 0 1
## ACH-001403 0 1 0 0
## ACH-001407 1 0 0 0
## ACH-001408 0 0 1 0
## ACH-001409 0 0 0 0
## ACH-001410 0 1 0 0
## ACH-001411 0 0 0 0
## ACH-001412 0 0 0 1
## ACH-001413 0 0 0 1
## ACH-001414 0 0 0 1
## ACH-001415 0 0 0 1
## ACH-001416 0 0 0 1
## ACH-001418 0 0 0 0
## ACH-001419 0 0 0 0
## ACH-001421 0 0 0 0
## ACH-001422 0 0 0 0
## ACH-001433 0 1 0 0
## ACH-001441 0 1 0 0
## ACH-001442 0 0 1 0
## ACH-001443 0 0 0 1
## ACH-001450 0 0 0 0
## ACH-001451 0 0 0 0
## ACH-001453 0 0 0 0
## ACH-001454 0 0 0 0
## ACH-001456 0 0 0 0
## ACH-001458 0 0 0 0
## ACH-001459 0 0 0 0
## ACH-001460 0 0 0 0
## ACH-001461 0 0 0 0
## ACH-001484 0 0 0 1
## ACH-001485 0 0 0 1
## ACH-001494 0 0 0 0
## ACH-001495 0 0 0 0
## ACH-001496 0 0 0 0
## ACH-001497 0 0 0 0
## ACH-001498 1 0 0 0
## ACH-001500 0 0 0 0
## ACH-001509 0 0 0 0
## ACH-001510 0 0 0 0
## ACH-001511 0 0 0 0
## ACH-001513 1 0 0 0
## ACH-001515 1 0 0 0
## ACH-001516 0 0 0 1
## ACH-001517 0 0 0 1
## ACH-001518 0 0 0 1
## ACH-001519 0 0 0 0
## ACH-001520 0 0 0 0
## ACH-001521 0 0 0 1
## ACH-001522 0 0 1 0
## ACH-001523 0 0 0 1
## ACH-001524 0 0 0 1
## ACH-001525 1 0 0 0
## ACH-001526 0 0 0 0
## ACH-001528 1 0 0 0
## ACH-001529 0 0 0 1
## ACH-001530 1 0 0 0
## ACH-001532 0 1 0 0
## ACH-001536 0 0 0 1
## ACH-001538 0 0 0 0
## ACH-001539 0 0 0 1
## ACH-001540 0 0 0 0
## ACH-001541 0 0 0 0
## ACH-001542 1 0 0 0
## ACH-001543 0 0 0 0
## ACH-001548 0 1 0 0
## ACH-001549 0 0 0 1
## ACH-001550 1 0 0 0
## ACH-001551 1 0 0 0
## ACH-001552 0 0 0 1
## ACH-001554 0 0 0 0
## ACH-001555 0 0 0 0
## ACH-001556 0 0 0 0
## ACH-001557 0 0 0 0
## ACH-001558 0 0 0 0
## ACH-001559 0 0 0 0
## ACH-001560 0 0 0 0
## ACH-001561 0 0 0 0
## ACH-001562 0 0 0 0
## ACH-001563 1 0 0 0
## ACH-001566 1 0 0 0
## ACH-001567 1 0 0 0
## ACH-001568 1 0 0 0
## ACH-001569 1 0 0 0
## ACH-001570 1 0 0 0
## ACH-001573 0 1 0 0
## ACH-001574 0 0 0 0
## ACH-001577 0 0 0 0
## ACH-001578 0 0 0 0
## ACH-001603 0 0 0 0
## ACH-001605 0 0 0 0
## ACH-001607 0 0 1 0
## ACH-001608 0 0 0 1
## ACH-001609 0 0 0 0
## ACH-001610 0 0 0 1
## ACH-001611 0 0 0 0
## ACH-001613 0 0 0 0
## ACH-001616 0 0 0 0
## ACH-001617 0 0 0 0
## ACH-001618 0 0 0 0
## ACH-001619 0 0 1 0
## ACH-001622 0 0 0 0
## ACH-001623 0 0 0 1
## ACH-001624 0 0 0 0
## ACH-001625 1 0 0 0
## ACH-001626 0 0 1 0
## ACH-001627 0 0 0 0
## ACH-001628 0 1 0 0
## ACH-001630 0 0 1 0
## ACH-001632 0 1 0 0
## ACH-001634 0 0 0 0
## ACH-001636 0 0 0 0
## ACH-001638 0 0 0 1
## ACH-001642 0 0 0 0
## ACH-001645 0 0 0 1
## ACH-001647 0 0 1 0
## ACH-001648 0 0 0 0
## ACH-001649 0 0 0 0
## ACH-001650 0 0 0 1
## ACH-001651 0 0 0 0
## ACH-001652 0 0 0 0
## ACH-001653 0 0 0 1
## ACH-001654 0 0 0 0
## ACH-001655 0 0 0 0
## ACH-001656 0 0 1 0
## ACH-001664 1 0 0 0
## ACH-001668 0 0 0 0
## ACH-001670 0 0 0 0
## ACH-001673 0 0 0 0
## ACH-001674 0 0 0 1
## ACH-001677 0 0 0 0
## ACH-001685 0 0 0 0
## ACH-001687 0 0 0 0
## ACH-001688 0 0 0 0
## ACH-001690 0 0 0 0
## ACH-001692 0 0 0 0
## ACH-001694 0 0 0 1
## ACH-001698 0 0 0 0
## ACH-001699 0 0 0 0
## ACH-001702 0 1 0 0
## ACH-001703 0 1 0 0
## ACH-001709 0 0 0 1
## ACH-001711 0 0 0 0
## ACH-001719 0 0 0 1
## ACH-001740 0 0 1 0
## ACH-001765 0 1 0 0
## ACH-001786 0 0 0 1
## ACH-001794 0 0 0 0
## ACH-001807 0 0 0 0
## ACH-001814 0 0 0 0
## ACH-001818 0 1 0 0
## ACH-001819 0 0 1 0
## ACH-001820 0 0 1 0
## ACH-001850 0 0 1 0
## ACH-001861 0 0 1 0
## ACH-001961 0 1 0 0
## ACH-001991 0 0 0 1
## ACH-002011 0 0 0 0
## ACH-002015 0 0 0 0
## ACH-002016 0 0 0 0
## ACH-002017 0 0 1 0
## ACH-002018 0 1 0 0
## ACH-002019 0 0 0 0
## ACH-002022 0 0 0 1
## ACH-002023 0 0 0 0
## ACH-002024 0 0 0 1
## ACH-002025 0 0 0 0
## ACH-002026 0 0 0 1
## ACH-002027 0 0 0 1
## ACH-002029 0 0 0 1
## ACH-002038 0 0 0 0
## ACH-002039 0 0 1 0
## ACH-002041 0 0 0 1
## ACH-002042 0 0 0 0
## ACH-002044 0 0 0 0
## ACH-002045 0 0 0 1
## ACH-002046 0 0 0 0
## ACH-002059 0 0 0 1
## ACH-002062 0 0 0 1
## ACH-002065 0 0 1 0
## ACH-002066 0 0 0 0
## ACH-002067 0 0 0 0
## ACH-002069 0 0 0 0
## ACH-002446 0 0 1 0
## ACH-002508 1 0 0 0
## ACH-002509 1 0 0 0
## ACH-002510 1 0 0 0
## ACH-002511 0 0 0 0
## Primary.Low Primary.Medium
## ACH-000001 0 0
## ACH-000002 1 0
## ACH-000006 1 0
## ACH-000007 0 1
## ACH-000009 0 1
## ACH-000011 0 0
## ACH-000012 1 0
## ACH-000013 0 0
## ACH-000014 0 0
## ACH-000015 0 1
## ACH-000016 0 0
## ACH-000017 0 0
## ACH-000018 1 0
## ACH-000019 0 0
## ACH-000021 0 0
## ACH-000022 0 0
## ACH-000023 0 0
## ACH-000024 0 0
## ACH-000026 0 0
## ACH-000027 1 0
## ACH-000028 0 0
## ACH-000029 0 1
## ACH-000030 0 0
## ACH-000031 1 0
## ACH-000033 0 0
## ACH-000035 0 0
## ACH-000039 0 0
## ACH-000040 1 0
## ACH-000041 1 0
## ACH-000042 1 0
## ACH-000043 0 0
## ACH-000044 0 0
## ACH-000045 1 0
## ACH-000046 0 0
## ACH-000048 1 0
## ACH-000050 0 0
## ACH-000051 1 0
## ACH-000052 1 0
## ACH-000054 0 0
## ACH-000055 0 0
## ACH-000056 0 0
## ACH-000059 1 0
## ACH-000060 1 0
## ACH-000062 0 1
## ACH-000065 1 0
## ACH-000066 0 0
## ACH-000067 1 0
## ACH-000070 1 0
## ACH-000073 1 0
## ACH-000075 1 0
## ACH-000078 0 0
## ACH-000082 1 0
## ACH-000085 1 0
## ACH-000087 1 0
## ACH-000089 0 0
## ACH-000090 0 0
## ACH-000091 0 0
## ACH-000092 0 0
## ACH-000093 1 0
## ACH-000094 0 0
## ACH-000096 1 0
## ACH-000097 0 0
## ACH-000098 1 0
## ACH-000100 0 0
## ACH-000102 1 0
## ACH-000103 0 0
## ACH-000105 1 0
## ACH-000107 1 0
## ACH-000108 0 0
## ACH-000109 0 1
## ACH-000111 1 0
## ACH-000113 1 0
## ACH-000114 0 0
## ACH-000115 0 0
## ACH-000116 0 0
## ACH-000117 0 0
## ACH-000118 0 0
## ACH-000121 0 0
## ACH-000123 1 0
## ACH-000124 1 0
## ACH-000125 0 0
## ACH-000126 0 1
## ACH-000127 0 1
## ACH-000129 0 0
## ACH-000131 1 0
## ACH-000132 0 1
## ACH-000133 1 0
## ACH-000135 1 0
## ACH-000136 1 0
## ACH-000137 1 0
## ACH-000138 0 0
## ACH-000139 1 0
## ACH-000140 0 0
## ACH-000142 1 0
## ACH-000144 0 0
## ACH-000145 1 0
## ACH-000146 0 1
## ACH-000147 0 0
## ACH-000148 1 0
## ACH-000149 0 0
## ACH-000150 1 0
## ACH-000152 1 0
## ACH-000153 0 0
## ACH-000155 1 0
## ACH-000157 0 0
## ACH-000158 0 1
## ACH-000159 0 1
## ACH-000161 0 0
## ACH-000163 1 0
## ACH-000164 1 0
## ACH-000166 0 1
## ACH-000167 0 0
## ACH-000168 0 1
## ACH-000169 0 1
## ACH-000171 1 0
## ACH-000172 0 0
## ACH-000174 1 0
## ACH-000176 1 0
## ACH-000177 0 0
## ACH-000178 0 0
## ACH-000179 0 0
## ACH-000181 0 1
## ACH-000183 0 1
## ACH-000186 0 1
## ACH-000187 1 0
## ACH-000188 1 0
## ACH-000189 0 1
## ACH-000191 0 0
## ACH-000192 0 1
## ACH-000193 1 0
## ACH-000194 1 0
## ACH-000196 0 0
## ACH-000197 0 1
## ACH-000200 0 1
## ACH-000201 1 0
## ACH-000202 0 1
## ACH-000203 0 0
## ACH-000204 0 1
## ACH-000205 0 0
## ACH-000207 0 0
## ACH-000210 0 0
## ACH-000211 0 1
## ACH-000212 0 0
## ACH-000213 0 0
## ACH-000217 0 1
## ACH-000219 0 0
## ACH-000221 0 1
## ACH-000222 0 0
## ACH-000223 0 1
## ACH-000227 0 0
## ACH-000228 1 0
## ACH-000229 1 0
## ACH-000231 0 1
## ACH-000232 0 1
## ACH-000234 1 0
## ACH-000235 0 1
## ACH-000236 0 1
## ACH-000237 0 1
## ACH-000238 0 1
## ACH-000239 0 0
## ACH-000240 1 0
## ACH-000242 1 0
## ACH-000244 1 0
## ACH-000246 0 0
## ACH-000247 0 0
## ACH-000248 0 0
## ACH-000249 1 0
## ACH-000250 0 1
## ACH-000252 0 1
## ACH-000253 0 0
## ACH-000255 0 0
## ACH-000256 0 0
## ACH-000257 0 0
## ACH-000258 0 0
## ACH-000259 0 1
## ACH-000260 0 0
## ACH-000261 0 0
## ACH-000263 0 1
## ACH-000264 0 0
## ACH-000265 0 0
## ACH-000269 0 1
## ACH-000270 0 1
## ACH-000271 0 0
## ACH-000272 1 0
## ACH-000273 1 0
## ACH-000275 0 0
## ACH-000276 0 1
## ACH-000277 0 1
## ACH-000278 0 0
## ACH-000280 0 1
## ACH-000281 0 1
## ACH-000282 0 0
## ACH-000285 0 1
## ACH-000286 0 1
## ACH-000288 1 0
## ACH-000290 0 0
## ACH-000291 0 0
## ACH-000292 0 0
## ACH-000293 1 0
## ACH-000294 0 1
## ACH-000296 0 0
## ACH-000297 0 0
## ACH-000298 0 0
## ACH-000300 0 0
## ACH-000301 0 1
## ACH-000302 0 1
## ACH-000303 0 0
## ACH-000304 0 0
## ACH-000306 1 0
## ACH-000307 0 0
## ACH-000308 0 0
## ACH-000310 0 0
## ACH-000311 0 0
## ACH-000312 0 0
## ACH-000313 0 1
## ACH-000314 0 0
## ACH-000315 0 0
## ACH-000317 0 1
## ACH-000318 0 1
## ACH-000320 0 1
## ACH-000322 0 0
## ACH-000323 1 0
## ACH-000324 0 1
## ACH-000325 0 0
## ACH-000327 1 0
## ACH-000329 1 0
## ACH-000330 0 0
## ACH-000332 0 0
## ACH-000333 1 0
## ACH-000334 0 0
## ACH-000335 0 0
## ACH-000336 1 0
## ACH-000338 0 1
## ACH-000341 0 0
## ACH-000343 0 1
## ACH-000344 0 0
## ACH-000345 0 0
## ACH-000347 0 0
## ACH-000348 0 0
## ACH-000349 0 1
## ACH-000350 0 0
## ACH-000351 0 0
## ACH-000352 0 0
## ACH-000353 0 0
## ACH-000354 0 0
## ACH-000355 0 0
## ACH-000356 0 0
## ACH-000358 0 0
## ACH-000359 1 0
## ACH-000360 0 0
## ACH-000361 0 0
## ACH-000362 0 1
## ACH-000363 0 1
## ACH-000364 0 1
## ACH-000365 0 0
## ACH-000366 0 0
## ACH-000367 0 0
## ACH-000368 0 1
## ACH-000373 0 1
## ACH-000374 1 0
## ACH-000375 0 1
## ACH-000376 0 1
## ACH-000378 0 0
## ACH-000379 0 0
## ACH-000380 0 1
## ACH-000381 0 0
## ACH-000382 0 0
## ACH-000383 0 1
## ACH-000384 0 1
## ACH-000386 0 1
## ACH-000388 0 0
## ACH-000389 0 1
## ACH-000390 0 1
## ACH-000391 0 0
## ACH-000392 0 0
## ACH-000393 0 1
## ACH-000394 0 0
## ACH-000395 0 0
## ACH-000396 0 1
## ACH-000397 0 1
## ACH-000399 0 0
## ACH-000400 0 0
## ACH-000401 0 0
## ACH-000403 0 0
## ACH-000404 0 1
## ACH-000406 0 0
## ACH-000407 0 1
## ACH-000409 0 0
## ACH-000410 0 1
## ACH-000411 1 0
## ACH-000412 0 1
## ACH-000414 0 0
## ACH-000415 1 0
## ACH-000416 0 0
## ACH-000417 0 1
## ACH-000418 1 0
## ACH-000419 0 0
## ACH-000420 0 1
## ACH-000421 0 1
## ACH-000422 0 1
## ACH-000423 0 0
## ACH-000424 0 1
## ACH-000427 0 0
## ACH-000429 0 0
## ACH-000430 0 1
## ACH-000431 0 0
## ACH-000433 0 0
## ACH-000434 0 0
## ACH-000435 0 0
## ACH-000437 0 1
## ACH-000438 0 1
## ACH-000441 0 0
## ACH-000442 0 1
## ACH-000443 0 1
## ACH-000444 0 0
## ACH-000445 0 1
## ACH-000446 0 0
## ACH-000447 0 1
## ACH-000448 0 0
## ACH-000449 0 1
## ACH-000450 0 1
## ACH-000451 0 1
## ACH-000453 0 1
## ACH-000454 0 0
## ACH-000456 0 1
## ACH-000457 0 0
## ACH-000458 0 0
## ACH-000459 0 1
## ACH-000460 0 0
## ACH-000463 0 0
## ACH-000464 0 1
## ACH-000465 0 0
## ACH-000466 0 0
## ACH-000468 0 1
## ACH-000469 0 1
## ACH-000470 0 0
## ACH-000472 0 0
## ACH-000473 0 0
## ACH-000475 0 0
## ACH-000476 0 1
## ACH-000477 0 0
## ACH-000478 0 1
## ACH-000479 0 1
## ACH-000480 0 1
## ACH-000481 0 0
## ACH-000482 0 1
## ACH-000483 0 1
## ACH-000484 1 0
## ACH-000485 0 0
## ACH-000487 0 0
## ACH-000488 0 0
## ACH-000489 0 1
## ACH-000490 0 0
## ACH-000491 0 0
## ACH-000493 0 1
## ACH-000495 0 1
## ACH-000496 0 0
## ACH-000501 0 0
## ACH-000502 0 0
## ACH-000503 0 1
## ACH-000504 0 1
## ACH-000505 0 1
## ACH-000506 0 0
## ACH-000507 0 0
## ACH-000508 0 0
## ACH-000510 0 0
## ACH-000511 0 0
## ACH-000514 0 0
## ACH-000515 0 0
## ACH-000517 0 0
## ACH-000518 0 1
## ACH-000520 0 0
## ACH-000521 0 0
## ACH-000522 0 1
## ACH-000523 0 0
## ACH-000524 0 0
## ACH-000525 0 0
## ACH-000526 1 0
## ACH-000527 0 0
## ACH-000528 0 0
## ACH-000530 0 1
## ACH-000532 0 0
## ACH-000534 0 0
## ACH-000535 1 0
## ACH-000536 0 0
## ACH-000538 0 1
## ACH-000539 1 0
## ACH-000541 0 0
## ACH-000542 0 0
## ACH-000544 0 1
## ACH-000545 0 0
## ACH-000546 0 0
## ACH-000547 0 1
## ACH-000548 0 0
## ACH-000550 0 0
## ACH-000551 0 0
## ACH-000552 0 0
## ACH-000553 0 0
## ACH-000554 0 0
## ACH-000555 1 0
## ACH-000557 1 0
## ACH-000558 0 1
## ACH-000559 0 0
## ACH-000561 0 0
## ACH-000562 0 0
## ACH-000563 0 0
## ACH-000564 0 0
## ACH-000565 0 0
## ACH-000566 0 1
## ACH-000568 0 1
## ACH-000569 1 0
## ACH-000570 0 0
## ACH-000571 0 1
## ACH-000572 0 1
## ACH-000573 0 0
## ACH-000574 0 0
## ACH-000576 0 0
## ACH-000577 0 0
## ACH-000578 0 1
## ACH-000579 0 0
## ACH-000580 0 1
## ACH-000581 0 0
## ACH-000582 0 0
## ACH-000584 0 0
## ACH-000585 0 0
## ACH-000586 0 0
## ACH-000587 0 1
## ACH-000588 0 0
## ACH-000589 0 0
## ACH-000590 0 0
## ACH-000593 0 1
## ACH-000594 0 0
## ACH-000595 0 0
## ACH-000596 0 0
## ACH-000598 0 0
## ACH-000599 0 1
## ACH-000600 1 0
## ACH-000601 0 1
## ACH-000603 0 0
## ACH-000606 1 0
## ACH-000607 0 0
## ACH-000608 0 1
## ACH-000609 0 1
## ACH-000610 0 0
## ACH-000611 0 0
## ACH-000613 1 0
## ACH-000614 0 0
## ACH-000616 0 0
## ACH-000617 0 0
## ACH-000619 0 1
## ACH-000620 0 0
## ACH-000621 0 0
## ACH-000622 0 0
## ACH-000623 0 1
## ACH-000624 0 1
## ACH-000625 0 1
## ACH-000627 0 0
## ACH-000628 0 0
## ACH-000631 0 0
## ACH-000632 0 0
## ACH-000633 0 0
## ACH-000635 0 0
## ACH-000637 0 0
## ACH-000638 0 0
## ACH-000639 0 0
## ACH-000640 0 0
## ACH-000643 0 0
## ACH-000644 0 1
## ACH-000645 1 0
## ACH-000646 0 0
## ACH-000647 0 0
## ACH-000648 0 0
## ACH-000649 0 0
## ACH-000650 0 0
## ACH-000651 0 0
## ACH-000652 0 0
## ACH-000653 0 1
## ACH-000655 0 1
## ACH-000656 0 0
## ACH-000657 0 0
## ACH-000658 0 0
## ACH-000659 0 0
## ACH-000660 0 1
## ACH-000662 0 0
## ACH-000663 0 0
## ACH-000665 0 0
## ACH-000666 0 0
## ACH-000667 0 0
## ACH-000668 0 1
## ACH-000669 0 0
## ACH-000670 0 0
## ACH-000672 0 0
## ACH-000674 0 0
## ACH-000675 0 0
## ACH-000677 0 1
## ACH-000678 0 0
## ACH-000679 0 0
## ACH-000680 0 0
## ACH-000681 0 0
## ACH-000683 0 0
## ACH-000684 0 1
## ACH-000685 0 0
## ACH-000686 0 1
## ACH-000688 1 0
## ACH-000691 0 0
## ACH-000692 0 1
## ACH-000693 0 0
## ACH-000694 0 0
## ACH-000695 0 0
## ACH-000696 0 0
## ACH-000697 0 0
## ACH-000698 0 0
## ACH-000699 0 0
## ACH-000701 0 1
## ACH-000703 0 0
## ACH-000704 0 0
## ACH-000705 0 0
## ACH-000706 0 0
## ACH-000708 0 0
## ACH-000709 0 1
## ACH-000710 1 0
## ACH-000711 0 0
## ACH-000712 0 1
## ACH-000713 0 1
## ACH-000714 0 0
## ACH-000718 0 0
## ACH-000719 0 0
## ACH-000720 0 1
## ACH-000721 0 0
## ACH-000722 0 0
## ACH-000724 0 1
## ACH-000725 0 0
## ACH-000729 0 0
## ACH-000730 0 0
## ACH-000731 1 0
## ACH-000732 1 0
## ACH-000733 0 0
## ACH-000734 0 0
## ACH-000735 0 1
## ACH-000736 0 0
## ACH-000737 0 0
## ACH-000738 0 0
## ACH-000739 0 1
## ACH-000740 0 0
## ACH-000741 0 1
## ACH-000743 0 0
## ACH-000744 0 0
## ACH-000745 0 0
## ACH-000746 0 0
## ACH-000747 0 0
## ACH-000748 0 1
## ACH-000749 0 0
## ACH-000750 0 0
## ACH-000752 0 0
## ACH-000753 0 0
## ACH-000755 0 1
## ACH-000756 0 0
## ACH-000758 0 0
## ACH-000759 0 0
## ACH-000761 0 0
## ACH-000763 0 0
## ACH-000764 0 0
## ACH-000765 0 0
## ACH-000766 0 0
## ACH-000767 0 0
## ACH-000768 0 0
## ACH-000769 0 0
## ACH-000771 0 1
## ACH-000774 0 0
## ACH-000775 0 0
## ACH-000776 0 0
## ACH-000777 0 0
## ACH-000778 0 0
## ACH-000780 0 0
## ACH-000781 0 0
## ACH-000782 0 0
## ACH-000783 0 0
## ACH-000784 0 0
## ACH-000785 0 0
## ACH-000787 0 0
## ACH-000788 0 0
## ACH-000789 0 0
## ACH-000790 0 0
## ACH-000791 0 0
## ACH-000792 0 1
## ACH-000793 0 0
## ACH-000794 0 0
## ACH-000796 0 0
## ACH-000797 0 1
## ACH-000798 0 1
## ACH-000799 0 0
## ACH-000800 0 0
## ACH-000802 0 0
## ACH-000803 0 0
## ACH-000804 0 0
## ACH-000805 0 0
## ACH-000808 0 0
## ACH-000809 0 0
## ACH-000810 0 0
## ACH-000811 0 0
## ACH-000812 0 0
## ACH-000813 0 0
## ACH-000815 0 0
## ACH-000816 0 0
## ACH-000817 0 0
## ACH-000818 0 0
## ACH-000819 0 0
## ACH-000820 0 0
## ACH-000821 0 0
## ACH-000822 0 0
## ACH-000823 0 0
## ACH-000824 0 0
## ACH-000825 0 0
## ACH-000826 0 0
## ACH-000828 0 0
## ACH-000830 0 0
## ACH-000831 0 0
## ACH-000832 0 0
## ACH-000833 0 0
## ACH-000834 0 0
## ACH-000835 0 0
## ACH-000837 0 0
## ACH-000838 0 0
## ACH-000839 0 0
## ACH-000840 0 0
## ACH-000841 0 0
## ACH-000842 0 0
## ACH-000843 0 0
## ACH-000844 0 0
## ACH-000845 0 0
## ACH-000846 0 0
## ACH-000847 0 0
## ACH-000848 0 0
## ACH-000849 0 0
## ACH-000850 0 0
## ACH-000851 0 0
## ACH-000852 0 0
## ACH-000853 0 0
## ACH-000855 0 0
## ACH-000856 0 0
## ACH-000857 0 1
## ACH-000858 0 0
## ACH-000859 0 0
## ACH-000860 0 0
## ACH-000861 0 0
## ACH-000862 0 0
## ACH-000863 0 0
## ACH-000864 0 0
## ACH-000865 0 0
## ACH-000866 0 0
## ACH-000867 0 0
## ACH-000868 0 0
## ACH-000869 0 0
## ACH-000870 0 0
## ACH-000871 0 0
## ACH-000873 0 0
## ACH-000874 0 0
## ACH-000875 0 0
## ACH-000876 0 0
## ACH-000877 0 0
## ACH-000878 0 0
## ACH-000879 0 0
## ACH-000880 0 0
## ACH-000881 0 0
## ACH-000882 0 0
## ACH-000883 0 0
## ACH-000884 0 0
## ACH-000885 0 0
## ACH-000886 0 0
## ACH-000888 0 0
## ACH-000889 0 0
## ACH-000890 0 0
## ACH-000891 0 0
## ACH-000892 0 0
## ACH-000893 0 0
## ACH-000894 0 0
## ACH-000895 0 0
## ACH-000896 0 0
## ACH-000897 0 0
## ACH-000898 0 0
## ACH-000899 0 0
## ACH-000900 0 0
## ACH-000901 0 0
## ACH-000902 0 0
## ACH-000903 0 0
## ACH-000904 0 0
## ACH-000906 0 0
## ACH-000907 0 0
## ACH-000908 0 0
## ACH-000909 0 0
## ACH-000910 0 0
## ACH-000911 0 0
## ACH-000912 0 0
## ACH-000913 0 0
## ACH-000915 0 0
## ACH-000916 0 0
## ACH-000919 0 0
## ACH-000921 0 0
## ACH-000924 0 0
## ACH-000925 0 0
## ACH-000926 0 0
## ACH-000927 0 0
## ACH-000928 0 0
## ACH-000929 0 0
## ACH-000930 0 0
## ACH-000931 0 0
## ACH-000932 0 0
## ACH-000934 0 0
## ACH-000936 0 0
## ACH-000938 0 0
## ACH-000939 0 0
## ACH-000940 0 0
## ACH-000941 0 0
## ACH-000942 0 0
## ACH-000943 0 0
## ACH-000945 0 0
## ACH-000946 0 0
## ACH-000947 0 0
## ACH-000948 0 0
## ACH-000949 0 0
## ACH-000950 0 0
## ACH-000951 0 0
## ACH-000952 0 0
## ACH-000953 0 0
## ACH-000954 0 0
## ACH-000955 0 0
## ACH-000956 0 0
## ACH-000957 0 0
## ACH-000958 0 0
## ACH-000960 0 0
## ACH-000961 0 0
## ACH-000962 0 0
## ACH-000963 0 0
## ACH-000965 0 0
## ACH-000966 0 0
## ACH-000967 0 0
## ACH-000968 0 0
## ACH-000969 0 0
## ACH-000971 0 0
## ACH-000972 0 0
## ACH-000973 0 0
## ACH-000974 0 0
## ACH-000976 0 0
## ACH-000977 0 0
## ACH-000978 0 0
## ACH-000980 0 0
## ACH-000981 0 0
## ACH-000982 0 0
## ACH-000984 0 0
## ACH-000985 0 0
## ACH-000987 0 0
## ACH-000989 0 0
## ACH-000990 0 0
## ACH-000993 0 0
## ACH-000994 0 0
## ACH-000995 0 0
## ACH-000996 0 0
## ACH-000997 0 0
## ACH-000998 0 0
## ACH-001001 0 1
## ACH-001041 0 0
## ACH-001048 0 0
## ACH-001061 0 0
## ACH-001075 0 0
## ACH-001078 0 1
## ACH-001106 0 1
## ACH-001113 0 0
## ACH-001129 1 0
## ACH-001145 0 0
## ACH-001151 0 0
## ACH-001163 0 0
## ACH-001184 0 0
## ACH-001190 0 0
## ACH-001192 0 0
## ACH-001194 0 0
## ACH-001200 0 0
## ACH-001210 0 0
## ACH-001229 1 0
## ACH-001239 0 0
## ACH-001277 0 0
## ACH-001278 0 0
## ACH-001283 0 0
## ACH-001306 0 1
## ACH-001307 0 1
## ACH-001318 0 0
## ACH-001321 0 1
## ACH-001328 0 0
## ACH-001329 0 1
## ACH-001332 1 0
## ACH-001333 0 0
## ACH-001334 0 1
## ACH-001335 0 1
## ACH-001336 0 0
## ACH-001339 0 1
## ACH-001340 0 0
## ACH-001341 0 1
## ACH-001344 0 0
## ACH-001345 0 0
## ACH-001346 1 0
## ACH-001347 1 0
## ACH-001353 1 0
## ACH-001354 1 0
## ACH-001356 1 0
## ACH-001360 0 0
## ACH-001366 0 0
## ACH-001367 0 0
## ACH-001368 0 0
## ACH-001369 0 0
## ACH-001370 1 0
## ACH-001373 0 0
## ACH-001374 0 0
## ACH-001375 1 0
## ACH-001376 0 0
## ACH-001377 0 0
## ACH-001378 1 0
## ACH-001379 0 0
## ACH-001380 0 0
## ACH-001382 0 0
## ACH-001384 0 0
## ACH-001385 0 0
## ACH-001386 0 0
## ACH-001388 1 0
## ACH-001389 0 0
## ACH-001390 0 0
## ACH-001391 1 0
## ACH-001392 0 0
## ACH-001394 0 0
## ACH-001395 0 0
## ACH-001396 0 0
## ACH-001398 0 0
## ACH-001399 0 0
## ACH-001400 0 1
## ACH-001401 0 0
## ACH-001402 0 0
## ACH-001403 0 0
## ACH-001407 0 0
## ACH-001408 0 0
## ACH-001409 0 1
## ACH-001410 0 0
## ACH-001411 0 1
## ACH-001412 0 0
## ACH-001413 0 0
## ACH-001414 0 0
## ACH-001415 0 0
## ACH-001416 0 0
## ACH-001418 1 0
## ACH-001419 1 0
## ACH-001421 1 0
## ACH-001422 1 0
## ACH-001433 0 0
## ACH-001441 0 0
## ACH-001442 0 0
## ACH-001443 0 0
## ACH-001450 1 0
## ACH-001451 1 0
## ACH-001453 1 0
## ACH-001454 1 0
## ACH-001456 1 0
## ACH-001458 0 1
## ACH-001459 1 0
## ACH-001460 1 0
## ACH-001461 1 0
## ACH-001484 0 0
## ACH-001485 0 0
## ACH-001494 0 1
## ACH-001495 1 0
## ACH-001496 0 1
## ACH-001497 0 1
## ACH-001498 0 0
## ACH-001500 0 1
## ACH-001509 1 0
## ACH-001510 0 1
## ACH-001511 1 0
## ACH-001513 0 0
## ACH-001515 0 0
## ACH-001516 0 0
## ACH-001517 0 0
## ACH-001518 0 0
## ACH-001519 0 1
## ACH-001520 1 0
## ACH-001521 0 0
## ACH-001522 0 0
## ACH-001523 0 0
## ACH-001524 0 0
## ACH-001525 0 0
## ACH-001526 0 1
## ACH-001528 0 0
## ACH-001529 0 0
## ACH-001530 0 0
## ACH-001532 0 0
## ACH-001536 0 0
## ACH-001538 0 1
## ACH-001539 0 0
## ACH-001540 0 1
## ACH-001541 0 1
## ACH-001542 0 0
## ACH-001543 0 1
## ACH-001548 0 0
## ACH-001549 0 0
## ACH-001550 0 0
## ACH-001551 0 0
## ACH-001552 0 0
## ACH-001554 1 0
## ACH-001555 1 0
## ACH-001556 1 0
## ACH-001557 1 0
## ACH-001558 1 0
## ACH-001559 1 0
## ACH-001560 1 0
## ACH-001561 1 0
## ACH-001562 1 0
## ACH-001563 0 0
## ACH-001566 0 0
## ACH-001567 0 0
## ACH-001568 0 0
## ACH-001569 0 0
## ACH-001570 0 0
## ACH-001573 0 0
## ACH-001574 1 0
## ACH-001577 1 0
## ACH-001578 0 1
## ACH-001603 0 1
## ACH-001605 0 1
## ACH-001607 0 0
## ACH-001608 0 0
## ACH-001609 0 1
## ACH-001610 0 0
## ACH-001611 0 1
## ACH-001613 1 0
## ACH-001616 1 0
## ACH-001617 1 0
## ACH-001618 1 0
## ACH-001619 0 0
## ACH-001622 0 1
## ACH-001623 0 0
## ACH-001624 0 1
## ACH-001625 0 0
## ACH-001626 0 0
## ACH-001627 1 0
## ACH-001628 0 0
## ACH-001630 0 0
## ACH-001632 0 0
## ACH-001634 1 0
## ACH-001636 1 0
## ACH-001638 0 0
## ACH-001642 0 1
## ACH-001645 0 0
## ACH-001647 0 0
## ACH-001648 1 0
## ACH-001649 1 0
## ACH-001650 0 0
## ACH-001651 1 0
## ACH-001652 0 1
## ACH-001653 0 0
## ACH-001654 1 0
## ACH-001655 0 1
## ACH-001656 0 0
## ACH-001664 0 0
## ACH-001668 1 0
## ACH-001670 0 1
## ACH-001673 0 1
## ACH-001674 0 0
## ACH-001677 1 0
## ACH-001685 0 1
## ACH-001687 0 1
## ACH-001688 0 1
## ACH-001690 1 0
## ACH-001692 1 0
## ACH-001694 0 0
## ACH-001698 1 0
## ACH-001699 0 1
## ACH-001702 0 0
## ACH-001703 0 0
## ACH-001709 0 0
## ACH-001711 1 0
## ACH-001719 0 0
## ACH-001740 0 0
## ACH-001765 0 0
## ACH-001786 0 0
## ACH-001794 1 0
## ACH-001807 1 0
## ACH-001814 1 0
## ACH-001818 0 0
## ACH-001819 0 0
## ACH-001820 0 0
## ACH-001850 0 0
## ACH-001861 0 0
## ACH-001961 0 0
## ACH-001991 0 0
## ACH-002011 1 0
## ACH-002015 1 0
## ACH-002016 1 0
## ACH-002017 0 0
## ACH-002018 0 0
## ACH-002019 0 1
## ACH-002022 0 0
## ACH-002023 0 1
## ACH-002024 0 0
## ACH-002025 0 1
## ACH-002026 0 0
## ACH-002027 0 0
## ACH-002029 0 0
## ACH-002038 0 1
## ACH-002039 0 0
## ACH-002041 0 0
## ACH-002042 0 1
## ACH-002044 0 1
## ACH-002045 0 0
## ACH-002046 0 1
## ACH-002059 0 0
## ACH-002062 0 0
## ACH-002065 0 0
## ACH-002066 0 1
## ACH-002067 0 1
## ACH-002069 0 1
## ACH-002446 0 0
## ACH-002508 0 0
## ACH-002509 0 0
## ACH-002510 0 0
## ACH-002511 1 0
## attr(,"assign")
## [1] 1 1 1 1 1 1
## attr(,"contrasts")
## attr(,"contrasts")$Group
## [1] "contr.treatment"
# Make contrasts
contrast <- makeContrasts(MHvsPH = Metastasis.High-Primary.High,
MLvsPL = Metastasis.Low-Primary.Low,
MHvsML = Metastasis.High-Metastasis.Low,
PHvsPL = Primary.High-Primary.Low,
MHvsPL = Metastasis.High-Primary.Low,
PHvsML = Primary.High-Metastasis.Low,
levels=design)
keep <- filterByExpr(y, design)
y <- y[keep, , keep.lib.sizes=FALSE]
barplot(y$samples$lib.size,names=colnames(y),las=2)
# Add a title to the plot
title("Barplot of library sizes")
# Calculate normalization factors (TMM)
y <- calcNormFactors(y)
# Start limma voom pipeline
par(mfrow=c(1,1))
v <- voom(y,design,plot = TRUE)
# limma fit on the voom output (eBayes with trend = FALSE, i.e. limma-voom rather than limma-trend)
fit <- lmFit(v, design)
fit.cont <- contrasts.fit(fit, contrast)
fit.cont <- eBayes(fit.cont, trend = F, robust = F)
summa.fit <- decideTests(fit.cont)
summary(summa.fit)
## MHvsPH MLvsPL MHvsML PHvsPL MHvsPL PHvsML
## Down 395 288 1247 3343 3603 1663
## NotSig 24452 25136 22773 18323 16752 22572
## Up 899 322 1726 4080 5391 1511
## logFC AveExpr t P.Value adj.P.Val
## ENSG00000246430 -2.4408491 -5.87807662 -10.159632 3.659674e-23 9.422196e-19
## ENSG00000183305 4.2771331 -2.13010423 9.102870 4.531851e-19 5.833851e-15
## ENSG00000132274 -3.4277661 1.64846236 -9.043341 7.513842e-19 6.448380e-15
## ENSG00000268606 4.4404464 -1.42482359 8.960674 1.509436e-18 9.715482e-15
## ENSG00000277363 2.2837677 1.54239804 8.927157 1.999810e-18 1.029742e-14
## ENSG00000197172 4.4024882 0.05658333 8.689111 1.437818e-17 6.169677e-14
## ENSG00000169857 -0.7769217 3.53459179 -8.455885 9.511237e-17 3.060954e-13
## ENSG00000166710 -1.4130402 9.57623409 -8.468934 8.566925e-17 3.060954e-13
## ENSG00000224271 2.6623087 -3.39097971 8.413178 1.337986e-16 3.827533e-13
## ENSG00000279753 -1.7523331 0.80568349 -8.393601 1.563804e-16 4.026170e-13
## B
## ENSG00000246430 41.19648
## ENSG00000183305 32.21412
## ENSG00000132274 32.00096
## ENSG00000268606 31.10127
## ENSG00000277363 30.98745
## ENSG00000197172 29.01167
## ENSG00000169857 27.34579
## ENSG00000166710 26.86482
## ENSG00000224271 26.74426
## ENSG00000279753 26.73454
# Genome-wide plots
plotMD(fit.cont,coef="MHvsPL",status=summa.fit[,"MHvsPL"], values = c(-1, 1), hl.col=c("blue","red"), main = "MHvsPL")
glXYPlot(x=fit.cont$coefficients[,'MHvsPL'], y=fit.cont$lods[,'MHvsPL'],
xlab="logFC", ylab="B", main="MHvsPL",
counts=v$E, groups=Group, status=summa.fit[,'MHvsPL'],
anno=fit.cont$genes, side.main="ENSEMBL", folder="combined_volcano", html = "combined_volcano")
# Testing relative to a threshold - TREAT
fit.treat <- treat(fit.cont,lfc=log2(1.2))
res.treat <- decideTests(fit.treat)
summary(res.treat)
## MHvsPH MLvsPL MHvsML PHvsPL MHvsPL PHvsML
## Down 4 13 100 889 950 231
## NotSig 25695 25714 25477 23973 23131 25422
## Up 47 19 169 884 1665 93
## logFC AveExpr t P.Value adj.P.Val
## ENSG00000246430 -2.440849 -5.87807662 -9.064795 3.132069e-19 8.063824e-15
## ENSG00000183305 4.277133 -2.13010423 8.543063 2.358977e-17 3.036711e-13
## ENSG00000268606 4.440446 -1.42482359 8.429879 5.855893e-17 5.025527e-13
## ENSG00000132274 -3.427766 1.64846236 -8.349387 1.110790e-16 7.149602e-13
## ENSG00000197172 4.402488 0.05658333 8.169965 4.544392e-16 2.339998e-12
## ENSG00000277363 2.283768 1.54239804 7.898966 3.630208e-15 1.557722e-11
## ENSG00000224271 2.662309 -3.39097971 7.581961 3.824443e-14 1.406630e-10
## ENSG00000221867 4.203742 0.53575432 7.551791 4.766560e-14 1.533998e-10
## ENSG00000213401 3.855132 -0.96422466 7.517475 6.113278e-14 1.748805e-10
## ENSG00000279753 -1.752333 0.80568349 -7.133677 9.265418e-13 2.385475e-09
## logFC AveExpr t P.Value adj.P.Val
## ENSG00000246430 -2.079330 -5.878076622 -8.019834 1.447411e-15 3.726503e-11
## ENSG00000132274 -2.738173 1.648462364 -7.483681 7.803565e-14 1.004553e-09
## ENSG00000231240 2.609696 -4.038266083 7.163147 7.552882e-13 4.465378e-09
## ENSG00000169554 -3.097098 1.664564089 -7.143501 8.656336e-13 4.465378e-09
## ENSG00000154175 -2.438812 -0.003401133 -7.143234 8.671984e-13 4.465378e-09
## ENSG00000180178 3.159914 -1.458796036 6.935929 3.582940e-12 1.537440e-08
## ENSG00000126947 -3.045818 0.075321107 -6.908264 4.317650e-12 1.588032e-08
## ENSG00000251026 2.411530 -3.743433473 6.790641 9.473904e-12 3.048939e-08
## ENSG00000221852 -2.110220 -5.236553221 -6.621908 2.865415e-11 8.196998e-08
## ENSG00000224271 2.176765 -3.390979711 6.459792 8.109674e-11 2.087917e-07
# MHvsPL plot
plotMD(fit.treat,coef="MHvsPL",status=res.treat[,"MHvsPL"], values = c(-1, 1), hl.col=c("blue","red"), main = "MHvsPL (TREAT)")
glMDPlot(fit.treat, coef="MHvsPL", counts=v$E, groups=Group,
status=res.treat, side.main="ENSEMBL", main="MHvsPL",
folder="combined_md", html = "combined_md")
## Warning in makeAnno(x, anno): No gene annotation provided.
# PHvsPL
plotMD(fit.treat,coef="PHvsPL",status=res.treat[,"PHvsPL"], values = c(-1, 1), hl.col=c("blue","red"), main = "PHvsPL (TREAT)")
# GO enrichment on the TREAT results
# GO enrichment using gprofiler2::gost for MHvsPL
# Compute the full TREAT table once, then keep genes with adjusted p-value < 0.05
tt_MHvsPL <- topTreat(fit.treat, coef="MHvsPL", number=Inf)
de_genes_MHvsPL_raw <- rownames(tt_MHvsPL)[tt_MHvsPL$adj.P.Val < 0.05]
de_genes_MHvsPL <- stringr::str_extract(de_genes_MHvsPL_raw, "ENSG[0-9]+")
de_genes_MHvsPL <- na.omit(de_genes_MHvsPL)
go_MHvsPL <- gost(query = de_genes_MHvsPL, organism = "hsapiens", sources = c("GO:BP", "GO:MF", "GO:CC"))
head(go_MHvsPL$result, 10)
## query significant p_value term_size query_size intersection_size
## 1 query_1 TRUE 5.709244e-49 7322 1487 794
## 2 query_1 TRUE 4.044119e-40 6553 1487 709
## 3 query_1 TRUE 4.384922e-39 5997 1487 662
## 4 query_1 TRUE 1.324471e-38 4053 1487 499
## 5 query_1 TRUE 4.869980e-38 4727 1487 555
## 6 query_1 TRUE 1.525063e-37 1530 1487 256
## 7 query_1 TRUE 3.364042e-37 6540 1487 699
## 8 query_1 TRUE 3.490296e-37 6515 1487 697
## 9 query_1 TRUE 1.395714e-33 8999 1487 871
## 10 query_1 TRUE 1.069665e-31 6002 1487 639
## precision recall term_id source term_name
## 1 0.5339610 0.10844032 GO:0032501 GO:BP multicellular organismal process
## 2 0.4767989 0.10819472 GO:0032502 GO:BP developmental process
## 3 0.4451917 0.11038853 GO:0048856 GO:BP anatomical structure development
## 4 0.3355750 0.12311868 GO:0048731 GO:BP system development
## 5 0.3732347 0.11741062 GO:0007275 GO:BP multicellular organism development
## 6 0.1721587 0.16732026 GO:0007155 GO:BP cell adhesion
## 7 0.4700740 0.10688073 GO:0007154 GO:BP cell communication
## 8 0.4687290 0.10698388 GO:0023052 GO:BP signaling
## 9 0.5857431 0.09678853 GO:0050896 GO:BP response to stimulus
## 10 0.4297243 0.10646451 GO:0007165 GO:BP signal transduction
## effective_domain_size source_order parents
## 1 21026 7557 GO:0008150
## 2 21026 7558 GO:0008150
## 3 21026 12931 GO:0032502
## 4 21026 12821 GO:00072....
## 5 21026 2788 GO:00325....
## 6 21026 2706 GO:0009987
## 7 21026 2705 GO:0009987
## 8 21026 6607 GO:0050789
## 9 21026 13167 GO:0008150
## 10 21026 2716 GO:00071....
# GO enrichment using gprofiler2::gost for PHvsPL
# Compute the full TREAT table once, then keep genes with adjusted p-value < 0.05
tt_PHvsPL <- topTreat(fit.treat, coef="PHvsPL", number=Inf)
de_genes_PHvsPL_raw <- rownames(tt_PHvsPL)[tt_PHvsPL$adj.P.Val < 0.05]
de_genes_PHvsPL <- stringr::str_extract(de_genes_PHvsPL_raw, "ENSG[0-9]+")
de_genes_PHvsPL <- na.omit(de_genes_PHvsPL)
go_PHvsPL <- gost(query = de_genes_PHvsPL, organism = "hsapiens", sources = c("GO:BP", "GO:MF", "GO:CC"))
head(go_PHvsPL$result, 10)
## query significant p_value term_size query_size intersection_size
## 1 query_1 TRUE 3.254949e-34 1534 1063 201
## 2 query_1 TRUE 1.624975e-32 7322 1063 564
## 3 query_1 TRUE 3.787749e-32 2713 1063 284
## 4 query_1 TRUE 2.458910e-31 1530 1063 195
## 5 query_1 TRUE 4.197300e-31 5997 1063 486
## 6 query_1 TRUE 1.632578e-30 4053 1063 368
## 7 query_1 TRUE 2.893900e-30 6553 1063 515
## 8 query_1 TRUE 4.047618e-30 960 1063 145
## 9 query_1 TRUE 1.036108e-29 1793 1063 212
## 10 query_1 TRUE 6.441011e-29 1255 1063 168
## precision recall term_id source term_name
## 1 0.1890875 0.13102999 GO:0016477 GO:BP cell migration
## 2 0.5305738 0.07702813 GO:0032501 GO:BP multicellular organismal process
## 3 0.2671684 0.10468116 GO:0009653 GO:BP anatomical structure morphogenesis
## 4 0.1834431 0.12745098 GO:0007155 GO:BP cell adhesion
## 5 0.4571966 0.08104052 GO:0048856 GO:BP anatomical structure development
## 6 0.3461900 0.09079694 GO:0048731 GO:BP system development
## 7 0.4844779 0.07858996 GO:0032502 GO:BP developmental process
## 8 0.1364064 0.15104167 GO:0030334 GO:BP regulation of cell migration
## 9 0.1994356 0.11823759 GO:0048870 GO:BP cell motility
## 10 0.1580433 0.13386454 GO:0040011 GO:BP locomotion
## effective_domain_size source_order parents
## 1 21026 5204 GO:0048870
## 2 21026 7557 GO:0008150
## 3 21026 3557 GO:00325....
## 4 21026 2706 GO:0009987
## 5 21026 12931 GO:0032502
## 6 21026 12821 GO:00072....
## 7 21026 7558 GO:0008150
## 8 21026 6743 GO:00164....
## 9 21026 12944 GO:0009987
## 10 21026 9808 GO:0008150
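Six contrasts across the combined groups were evaluated. The decideTests summaries above count 25,746 genes per contrast after filtering. MHvsPL had the most differentially expressed genes (3,603 down and 5,391 up); TREAT with a 1.2-fold threshold reduced this to 950 down and 1,665 up, which is still a substantial number. The GO analyses are interpreted as reflecting underexpression of immune response terms for MHvsPL and changes related to ECM and cell migration for PHvsPL. The ten most significant terms printed above are broader for MHvsPL (development, cell adhesion, signaling), while for PHvsPL they are dominated by cell migration, cell adhesion and motility.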
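The TREAT step above tests against a threshold of lfc = log2(1.2) rather than against zero. As a minimal base R sketch of what that threshold means on the fold-change scale (the variable names are illustrative, and the two logFC values are copied from the TREAT table above):

```r
# Fold-change threshold passed to treat() in the analysis above
fold_change <- 1.2
lfc_threshold <- log2(fold_change)   # threshold on the log2 scale

# Two logFC values copied from the MHvsPL TREAT table, converted back
# to absolute fold changes for comparison with the threshold
example_logfc <- c(ENSG00000246430 = -2.440849, ENSG00000183305 = 4.277133)
fold_changes <- 2^abs(example_logfc)

print(round(lfc_threshold, 3))
print(round(fold_changes, 2))
```

Under TREAT a gene is only called when its logFC is credibly larger in magnitude than this threshold, which is why the counts drop relative to the plain eBayes summary.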
## New expression matrix for the PH and PL lineages
comb_exp = as.data.frame(cbind(Group, t(exp_matrix)))
select = c("4", "5")
comb_exp$Group = as.factor(comb_exp$Group)
comb_exp = comb_exp[comb_exp$Group %in% select, ]
comb_exp$Group = as.factor(comb_exp$Group)
comb_exp$Group <- factor(comb_exp$Group, labels = c("Primary.High", "Primary.Low"))
groups_combine <- comb_exp$Group
# Get differentially expressed genes for contrast PHvsPL
genes_diff = topTreat(fit.treat, coef="PHvsPL", number=100)
# Filter the expression matrix to consider these genes
indexes = which(names(comb_exp)%in%rownames(genes_diff))
exp_mut = comb_exp[,indexes]
logcounts_genes <- cpm(t(exp_mut), log=TRUE)
We selected the top 100 genes from the PHvsPL contrast and built a matrix for the 403 primary cell lines labeled Primary.High or Primary.Low, then computed log-CPM values.
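To make the log-CPM step concrete, here is a minimal base R sketch of the transformation. It follows my understanding of how edgeR's cpm(log = TRUE) handles the prior count (default 2, scaled by relative library size); treat the formula as an assumption to check against the edgeR documentation. The toy matrix and the function name log_cpm are illustrative and not part of the analysis.

```r
# Toy count matrix: genes in rows, samples in columns (illustrative values)
counts <- matrix(c(10, 0, 250,
                   40, 5, 900),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))

log_cpm <- function(counts, prior_count = 2) {
  lib_size <- colSums(counts)
  # Scale the prior count by relative library size so that it has a
  # comparable effect in every sample
  scaled_prior <- prior_count * lib_size / mean(lib_size)
  adj_lib_size <- lib_size + 2 * scaled_prior
  # Work on the transposed matrix so per-sample vectors recycle correctly,
  # then transpose back to genes x samples
  log2(t((t(counts) + scaled_prior) / adj_lib_size) * 1e6)
}

lc <- log_cpm(counts)
print(round(lc, 2))
```

The prior count keeps zero counts finite after the log. Note that library sizes here are column sums of whatever matrix is passed in, so applying the transformation to a 100-gene subset uses totals over those 100 genes only.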
# choose colors
mypalette <- brewer.pal(11,"RdYlBu")
morecols <- colorRampPalette(mypalette)
col.cell <- c("darkslategray1", "coral")[groups_combine]
heatmap.2(logcounts_genes,col=rev(morecols(50)), trace="none", main="100 differentially expressed genes PHvsPL", ColSideColors = col.cell, scale="row")
legend(-0.1, -0.125, legend=levels(groups_combine), fill=c("darkslategray1", "coral"), xpd = T)
The heatmap shows clusters with distinct expression patterns, some of which are enriched for Primary.High or for Primary.Low cell lines.
res.pca<-PCA(t(logcounts_genes), scale.unit = F, graph = F)
eig.val <- get_eigenvalue(res.pca)
eig.val[1:10,]
## eigenvalue variance.percent cumulative.variance.percent
## Dim.1 223.43053 25.228846 25.22885
## Dim.2 83.20050 9.394655 34.62350
## Dim.3 56.93482 6.428843 41.05234
## Dim.4 39.36907 4.445391 45.49774
## Dim.5 34.19801 3.861497 49.35923
## Dim.6 27.98429 3.159870 52.51910
## Dim.7 20.72826 2.340549 54.85965
## Dim.8 19.14781 2.162091 57.02174
## Dim.9 17.31825 1.955505 58.97725
## Dim.10 16.51059 1.864307 60.84155
# Number of dimensions until we reach 90% cumulative variance
i = 1
while ( eig.val[i,3] < 90 ) i = i + 1
i
## [1] 47
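The while loop above walks the cumulative variance column until it reaches 90%. A vectorised alternative is sketched below in base R, using only the first ten cumulative values from the table above as example input. The helper first_reaching is hypothetical; unlike the loop, it returns NA instead of running past the end when the target is never reached.

```r
# Cumulative variance percentages (first ten values copied from the table above)
cum_var <- c(25.22885, 34.62350, 41.05234, 45.49774, 49.35923,
             52.51910, 54.85965, 57.02174, 58.97725, 60.84155)

# Index of the first component whose cumulative variance reaches the target;
# NA if the target is never reached within the supplied values
first_reaching <- function(cum_var, target) {
  hits <- which(cum_var >= target)
  if (length(hits) == 0) NA_integer_ else hits[1]
}

print(first_reaching(cum_var, 40))
print(first_reaching(cum_var, 90))
```

With the full eigenvalue table, first_reaching(eig.val[, 3], 90) should give the same answer as the loop.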
fviz_famd_ind(res.pca, geom = c("point"), col.ind = "cos2", gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
palette = "rainbow", addEllipses = FALSE, ellipse.type = "confidence",
ggtheme = theme_minimal(), repel = TRUE, labels = F)
fviz_famd_ind(res.pca, geom = c("point"),
habillage = groups_combine, # color by groups
palette = c("rainbow"),
addEllipses = FALSE, ellipse.type = "confidence",
repel = TRUE, invisible = "quali.var", labels = FALSE )
The PCA shows that the first three components explain about 41% of the variance. Primary.High lines tend to sit on the negative side of PC1, but the two groups are not clearly separated.
kmm_genes <- kmeans(t(logcounts_genes), 2)
col.cell <- c( "coral","darkslategray1")[groups_combine]
col.cell_Kmm <- c("darkslategray1", "coral")[kmm_genes$cluster]
plot(res.pca$ind$coord[, 1], res.pca$ind$coord[, 2], col = col.cell_Kmm, pch = 16, xlab = "PC1", ylab = "PC2", main ="Map of predicted clusters")
legend(27, -25, legend=c("1", "2"),fill=c("darkslategray1", "coral"), xpd = T)
plot(res.pca$ind$coord[, 1], res.pca$ind$coord[, 2], col = col.cell, pch = 16, xlab = "PC1", ylab = "PC2", main ="Map of observed data")
legend(17, -25, legend=levels(groups_combine),fill=c( "coral","darkslategray1"), xpd = T)
To run k-means clustering on the dataset and find the theoretical clusters, a silhouette plot was built to check the optimal number of clusters, and \(2\) clusters were chosen.
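The code for the silhouette analysis is not shown in the notebook. A minimal sketch of how the number of clusters could be chosen by average silhouette width is given below. It uses the `cluster` package (a recommended package shipped with R) on toy data; the data, the candidate range 2:6 and the variable names are assumptions for illustration, not the authors' actual procedure.

```r
library(cluster)  # provides silhouette()

set.seed(16718)
# Toy data: two well-separated groups of 30 points in 2 dimensions
toy <- rbind(matrix(rnorm(60, mean = 0), ncol = 2),
             matrix(rnorm(60, mean = 10), ncol = 2))
d <- dist(toy)

# Average silhouette width for each candidate number of clusters
candidate_k <- 2:6
avg_sil <- sapply(candidate_k, function(k) {
  km <- kmeans(toy, centers = k, nstart = 10)
  mean(silhouette(km$cluster, d)[, "sil_width"])
})
names(avg_sil) <- candidate_k

# The k with the largest average silhouette width is preferred
best_k <- candidate_k[which.max(avg_sil)]
print(round(avg_sil, 3))
print(best_k)
```

For the real data, `toy` would be replaced by t(logcounts_genes), so that cell lines are the observations being clustered.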
The clustering itself was then performed, resulting in 2 groups that split evenly along the first component.
Comparing these clusters with the ones obtained by overlaying the group information reinforces the earlier conclusion that the lineages are poorly separated by this characteristic across the factor map.
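The comparison between predicted clusters and observed groups is made visually on the two maps. A cross-tabulation gives the same comparison as numbers. The sketch below uses hypothetical labels for 12 samples; with the real objects the call would be table(kmm_genes$cluster, groups_combine).

```r
# Hypothetical labels for 12 samples (illustrative only)
observed <- factor(c(rep("Primary.High", 6), rep("Primary.Low", 6)))
cluster  <- c(1, 1, 1, 2, 1, 2, 2, 2, 1, 2, 2, 1)

# Rows: k-means cluster, columns: observed group
confusion <- table(cluster = cluster, observed = observed)
print(confusion)

# Cluster numbering is arbitrary, so take the better of the two possible
# cluster-to-group matchings
matched <- sum(diag(confusion))
agreement <- max(matched, sum(confusion) - matched) / sum(confusion)
print(agreement)
```

An agreement close to 0.5 for two balanced groups would indicate that the clusters carry little information about the observed labels.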
Given the results above, we set out to build a machine learning model to predict the action of simvastatin on the cell lines, expressed as a classification of each line as having low ("Low") or high ("High") simvastatin action. \(100\) predictor genes from the differential expression analysis were used, together with the \(403\) primary cancer cell lines labeled Primary.High or Primary.Low.
Five frameworks were tested for this purpose: "Neural Networks", "Random Forest", "Decision Trees", "K Nearest Neighbors" and "Naïve Bayes".
For all models, the error was estimated during model building by 10-fold cross-validation repeated 5 times. To make the results reproducible, a seed with the value "16718" was set.
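To illustrate what 10-fold cross-validation repeated 5 times means for 403 samples, the base R sketch below builds the fold assignments by hand and tabulates the training-set sizes. It is a simplified, unstratified version: caret creates its own folds inside train() and balances them by class, so the exact partitions differ. The variable names are illustrative.

```r
set.seed(16718)
n_samples <- 403   # number of samples reported in the model output
n_folds   <- 10
n_repeats <- 5

# For each repeat, shuffle a balanced vector of fold labels
fold_ids <- replicate(n_repeats,
                      sample(rep(seq_len(n_folds), length.out = n_samples)),
                      simplify = FALSE)

# Training-set size for every (repeat, fold) resample:
# all samples except those in the held-out fold
train_sizes <- unlist(lapply(fold_ids, function(folds) {
  n_samples - tabulate(folds, nbins = n_folds)
}))

print(length(train_sizes))
print(table(train_sizes))
```

Each model is therefore fitted 50 times per hyperparameter combination, on training sets of 362 or 363 samples, which matches the "Summary of sample sizes" line in the caret output.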
# fixed seed so that results are always the same
set.seed(16718)
#repeated cross validation
cv.control = trainControl('repeatedcv', number = 10, repeats = 5)
#dataframe ML
ml_phpl <- as.data.frame(cbind(group = groups_combine, t(logcounts_genes)))
ml_phpl$group = as.factor(ml_phpl$group)#Neural Networks -- Classification, Regression
set.seed(16718)
phpl_nnet_cv = train(group~., data = ml_phpl, method = "nnet", tuneLenght=10, trControl=cv.control)## Neural Network
##
## 403 samples
## 100 predictors
## 2 classes: '1', '2'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold, repeated 5 times)
## Summary of sample sizes: 363, 363, 362, 363, 363, 362, ...
## Resampling results across tuning parameters:
##
## size decay Accuracy Kappa
## 1 0e+00 0.6622483 0.15694938
## 1 1e-04 0.6356751 0.09596943
## 1 1e-01 0.7445503 0.46214191
## 3 0e+00 0.7199278 0.37808896
## 3 1e-04 0.7392211 0.41720434
## 3 1e-01 0.7407724 0.44808694
## 5 0e+00 0.7403327 0.42262363
## 5 1e-04 0.7412486 0.44878799
## 5 1e-01 0.7432389 0.45620617
##
## Accuracy was used to select the optimal model using the largest value.
## The final values used for the model were size = 1 and decay = 0.1.
In building the “Neural Networks” model, the argument “tuneLenght = 10” was passed. In caret this argument is spelled “tuneLength” and sets the number of values tried per tuning parameter (or the number of random combinations when random search is used); the misspelled name is not matched, so the default of 3 values per parameter applies, which is why the table lists 3 sizes and 3 decays.
The selected parameters are a “size” of 1 (number of units in the hidden layer) and a “decay” of 0.1 (weight decay, a regularisation that limits overfitting), and the model reaches an accuracy of approximately \(0.74\).
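The tuning table above has 9 rows, one per combination of 3 “size” and 3 “decay” values. As a small sketch of how such a grid is enumerated and how the winning row is picked by accuracy, the block below rebuilds the table by hand; the accuracies are copied from the printed summary, and `nnet_grid` is just an illustrative name.

```r
# Grid in the order printed above: decay varies fastest within each size
nnet_grid <- expand.grid(decay = c(0, 1e-4, 1e-1), size = c(1, 3, 5))

# Cross-validated accuracies copied from the printed summary, row by row
nnet_grid$Accuracy <- c(0.6622483, 0.6356751, 0.7445503,
                        0.7199278, 0.7392211, 0.7407724,
                        0.7403327, 0.7412486, 0.7432389)

# "Accuracy was used to select the optimal model using the largest value"
best_nnet <- nnet_grid[which.max(nnet_grid$Accuracy), ]
best_nnet
```

The top three rows differ by less than 0.005 in accuracy, so the choice of `size = 1` over larger networks is a narrow one.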
#Random Forest -- Classification, Regression
set.seed(16718)
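# note: "tuneLenght" is not matched by caret (the argument is tuneLength), so the default grid is used
phpl_rf_cv = train(group~., data = ml_phpl, method = "rf", tuneLenght=10, trControl=cv.control)
phpl_rf_cv
## Random Forest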
##
## 403 samples
## 100 predictors
## 2 classes: '1', '2'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold, repeated 5 times)
## Summary of sample sizes: 363, 363, 362, 363, 363, 362, ...
## Resampling results across tuning parameters:
##
## mtry Accuracy Kappa
## 2 0.7809500 0.5227332
## 51 0.7845744 0.5370620
## 100 0.7810231 0.5284431
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was mtry = 51.
In building the “Random Forest” model, the argument “tuneLenght = 10” was passed; because caret's argument is spelled “tuneLength”, the misspelled name is not matched and the default of 3 candidate values was used (“mtry” of 2, 51 and 100).
The selected parameter is an “mtry” of 51 (number of variables randomly sampled as candidates at each split), and the model reaches an accuracy of approximately \(0.78\).
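Each summary reports Kappa next to Accuracy. As a reminder of how the two relate, the sketch below computes both from a confusion matrix. The counts are hypothetical toy values, not results from this analysis.

```r
# Toy confusion matrix (hypothetical counts): rows = predicted, columns = observed
conf_mat <- matrix(c(40, 10,
                      5, 45),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(predicted = c("1", "2"),
                                   observed  = c("1", "2")))

n_total <- sum(conf_mat)

# Observed agreement: proportion of samples on the diagonal
accuracy <- sum(diag(conf_mat)) / n_total

# Agreement expected by chance, from the row and column totals
expected <- sum(rowSums(conf_mat) * colSums(conf_mat)) / n_total^2

# Cohen's kappa: agreement beyond chance, scaled to a maximum of 1
kappa_stat <- (accuracy - expected) / (1 - expected)
c(accuracy = accuracy, kappa = kappa_stat)
```

Kappa is lower than accuracy whenever part of the agreement could have arisen by chance, which is why the Kappa values in the summaries sit around 0.4 to 0.5 for accuracies of 0.7 to 0.8.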
#Decision Trees -- Classification, Regression
set.seed(16718)
phpl_tree_cv = train(group~., data = ml_phpl, method = "rpart", tuneGrid=expand.grid(.cp=0.2), trControl=cv.control)
phpl_tree_cv
## CART
##
## 403 samples
## 100 predictors
## 2 classes: '1', '2'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold, repeated 5 times)
## Summary of sample sizes: 363, 363, 362, 363, 363, 362, ...
## Resampling results:
##
## Accuracy Kappa
## 0.7055594 0.3844856
##
## Tuning parameter 'cp' was held constant at a value of 0.2
In building the “Decision Trees” model, the argument “tuneGrid = expand.grid(.cp = 0.2)” was passed, which fixes the complexity parameter at a single value instead of searching over several.
The model therefore uses a constant “cp” of 0.2 (the complexity parameter: the minimum relative improvement in fit that a split must achieve to be kept), and reaches an accuracy of approximately \(0.71\).
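The effect of “cp” on tree size can be illustrated with `rpart` directly. The sketch below uses the built-in `iris` data as a stand-in (an assumption for illustration only; it is unrelated to the expression data) and counts the leaves of the tree grown at three cp values.

```r
library(rpart)  # recommended package shipped with R

# Candidate complexity parameters, from permissive to very strict
cp_values <- c(0.001, 0.2, 0.9)

# Grow a classification tree at each cp and count its terminal nodes
n_leaves <- sapply(cp_values, function(cp) {
  fit <- rpart(Species ~ ., data = iris, method = "class",
               control = rpart.control(cp = cp))
  sum(fit$frame$var == "<leaf>")
})
names(n_leaves) <- cp_values
n_leaves
```

A larger cp can only prune splits away, never add them, so a fixed value as high as 0.2 tends to give a very shallow tree. Searching over several cp values might recover some of the accuracy lost relative to the other models.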
#KNN -- Classification, Regression
set.seed(16718)
phpl_knn_cv = train(group~., data = ml_phpl, method = "knn", tuneGrid = expand.grid(k = 1:10), trControl=cv.control)
phpl_knn_cv
## k-Nearest Neighbors
##
## 403 samples
## 100 predictors
## 2 classes: '1', '2'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold, repeated 5 times)
## Summary of sample sizes: 363, 363, 362, 363, 363, 362, ...
## Resampling results across tuning parameters:
##
## k Accuracy Kappa
## 1 0.7123027 0.4017940
## 2 0.7148906 0.4105248
## 3 0.7381620 0.4513294
## 4 0.7347602 0.4452555
## 5 0.7401742 0.4519324
## 6 0.7529278 0.4817922
## 7 0.7565553 0.4916879
## 8 0.7615450 0.5005876
## 9 0.7576041 0.4946542
## 10 0.7546901 0.4870555
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was k = 8.
In building the “k Nearest Neighbours” model, the argument “tuneGrid = expand.grid(k = 1:10)” was passed, i.e. the numbers of neighbours for the algorithm to try.
The selected parameter is a “k” of 8 (number of neighbours), and the model reaches an accuracy of approximately \(0.76\).
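As a small illustration of what varying “k” means, the sketch below classifies held-out toy samples with `class::knn` for k from 1 to 10. The data are synthetic and the names `make_toy`, `gene_a` and `gene_b` are hypothetical; it uses a single train/test split rather than the repeated cross-validation applied above.

```r
library(class)  # recommended package shipped with R; provides knn()

set.seed(16718)
# Hypothetical helper: two well-separated classes described by two "genes"
make_toy <- function(n_per_class) {
  data.frame(
    gene_a = c(rnorm(n_per_class, mean = 0), rnorm(n_per_class, mean = 8)),
    gene_b = c(rnorm(n_per_class, mean = 0), rnorm(n_per_class, mean = 8)),
    group  = factor(rep(c("1", "2"), each = n_per_class))
  )
}
toy_train <- make_toy(40)
toy_test  <- make_toy(10)
features  <- c("gene_a", "gene_b")

# Accuracy on the held-out samples for each number of neighbours
knn_accuracy <- sapply(1:10, function(k) {
  pred <- knn(train = toy_train[, features], test = toy_test[, features],
              cl = toy_train$group, k = k)
  mean(pred == toy_test$group)
})
names(knn_accuracy) <- 1:10
knn_accuracy
```

Because kNN relies on distances between samples, predictors on very different scales may need centring and scaling first; the summaries above report “No pre-processing”.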
#Naive Bayes -- Classification
set.seed(16718)
phpl_nb_cv = train(group~., data = ml_phpl, method = "nb", trControl=cv.control)
phpl_nb_cv
## Naive Bayes
##
## 403 samples
## 100 predictors
## 2 classes: '1', '2'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold, repeated 5 times)
## Summary of sample sizes: 363, 363, 362, 363, 363, 362, ...
## Resampling results across tuning parameters:
##
## usekernel Accuracy Kappa
## FALSE 0.7663962 0.5205076
## TRUE 0.6727630 0.3830728
##
## Tuning parameter 'fL' was held constant at a value of 0
## Tuning
## parameter 'adjust' was held constant at a value of 1
## Accuracy was used to select the optimal model using the largest value.
## The final values used for the model were fL = 0, usekernel = FALSE and adjust
## = 1.
In building the “Naïve Bayes” model, no tuning argument was passed, so caret's default grid was used: “usekernel” was tried as both FALSE and TRUE, while “fL” and “adjust” were held constant.
The selected parameters are an “fL” of 0 (Laplace smoothing), no kernel (“usekernel = FALSE”, i.e. no kernel density estimate) and “adjust = 1” (adjustment of the kernel density bandwidth), and the model reaches an accuracy of approximately \(0.77\).
After comparing the accuracy of the 5 models for classifying the cancer cell lines, we selected the model obtained with the “Random Forest” algorithm as the most accurate, with a value of \(78\%\).
This value leaves room for improvement as machine learning performance goes; it could be raised by adding more data entries, or by enriching the gene set or including other variables that better discriminate between the classes.
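The comparison can be laid out in a few lines by collecting the best cross-validated accuracy from each model summary. The values below are copied from the printed caret output; `best_cv_accuracy` is just an illustrative name.

```r
# Best cross-validated accuracy per model, copied from the summaries above
best_cv_accuracy <- c(
  "Neural Networks"     = 0.7445503,
  "Random Forest"       = 0.7845744,
  "Decision Trees"      = 0.7055594,
  "K Nearest Neighbors" = 0.7615450,
  "Naive Bayes"         = 0.7663962
)

# Rank the models from most to least accurate
ranked <- sort(best_cv_accuracy, decreasing = TRUE)
round(ranked, 2)
```

The gap between the first and second model is under 0.02, so a formal comparison across the shared resamples would be needed before treating the ranking as more than indicative.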
control <- rfeControl(functions=rfFuncs, method="cv", number=10)
results <- rfe(group~., data = ml_phpl, rfeControl=control, sizes=c(1:10,20,40,60,80,100))
results
##
## Recursive feature selection
##
## Outer resampling method: Cross-Validated (10 fold)
##
## Resampling performance over subset size:
##
## Variables Accuracy Kappa AccuracySD KappaSD Selected
## 1 0.5806 0.1426 0.02692 0.04867
## 2 0.6648 0.3005 0.07813 0.15471
## 3 0.6973 0.3614 0.06127 0.13807
## 4 0.7172 0.4052 0.04194 0.09405
## 5 0.7175 0.4024 0.06071 0.13420
## 6 0.7397 0.4453 0.04556 0.10623
## 7 0.7469 0.4575 0.05431 0.12263
## 8 0.7417 0.4453 0.07228 0.17200
## 9 0.7367 0.4382 0.06721 0.15413
## 10 0.7364 0.4312 0.06983 0.16436
## 20 0.7619 0.4828 0.09205 0.21149
## 40 0.7590 0.4806 0.08310 0.17725
## 60 0.7864 0.5387 0.08202 0.17876 *
## 80 0.7688 0.4979 0.07998 0.17628
## 100 0.7864 0.5361 0.08368 0.18491
##
## The top 5 variables (out of 60):
## ENSG00000231864, ENSG00000184307, ENSG00000271580, ENSG00000224271, ENSG00000170293
important_genes <- predictors(results)
# Run gprofiler2 enrichment using ENSEMBL IDs
go_res <- gost(query = important_genes, organism = "hsapiens", sources = c("GO:BP", "GO:MF", "GO:CC"))
# Show top results
head(go_res$result, 10)
## query significant p_value term_size query_size intersection_size
## 1 query_1 TRUE 0.003506219 2 32 2
## 2 query_1 TRUE 0.013606114 327 32 6
## 3 query_1 TRUE 0.013844862 328 32 6
## 4 query_1 TRUE 0.014087002 329 32 6
## 5 query_1 TRUE 0.020997309 4 32 2
## 6 query_1 TRUE 0.042954982 397 31 5
## 7 query_1 TRUE 0.035644200 10 30 2
## 8 query_1 TRUE 0.038914221 167 30 4
## precision recall term_id source
## 1 0.06250000 1.00000000 GO:1904328 GO:BP
## 2 0.18750000 0.01834862 GO:0030198 GO:BP
## 3 0.18750000 0.01829268 GO:0043062 GO:BP
## 4 0.18750000 0.01823708 GO:0045229 GO:BP
## 5 0.06250000 0.50000000 GO:1990764 GO:BP
## 6 0.16129032 0.01259446 GO:0062023 GO:CC
## 7 0.06666667 0.20000000 GO:0098973 GO:MF
## 8 0.13333333 0.02395210 GO:0005201 GO:MF
## term_name
## 1 regulation of myofibroblast contraction
## 2 extracellular matrix organization
## 3 extracellular structure organization
## 4 external encapsulating structure organization
## 5 myofibroblast contraction
## 6 collagen-containing extracellular matrix
## 7 structural constituent of postsynaptic actin cytoskeleton
## 8 extracellular matrix structural constituent
## effective_domain_size source_order parents
## 1 21026 23353 GO:19031....
## 2 21026 6674 GO:00430....
## 3 21026 10414 GO:0016043
## 4 21026 11178 GO:0016043
## 5 21026 24840 GO:0070252
## 6 22149 2296 GO:0031012
## 7 20246 8535 GO:00052....
## 8 20246 1321 GO:0005198
Finally, feature selection was performed on the machine learning dataset using Recursive Feature Elimination. The algorithm was run with the error estimated by 10-fold cross-validation.
Through the “sizes” argument, the algorithm was given the sizes of the variable subsets to test. It determined that, of the \(100\) genes supplied, a subset of \(60\) gave the best accuracy (approximately \(0.79\)); the full set of \(100\) genes reaches the same accuracy to four decimal places.
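The subset-size choice can be reproduced from the resampling table, and the same table shows how a more lenient rule would behave. In the sketch below the accuracies are copied from the printed output; the 5% tolerance rule is an illustrative assumption, not something applied in this analysis.

```r
# Resampling performance by subset size, copied from the printed output
rfe_profile <- data.frame(
  Variables = c(1:10, 20, 40, 60, 80, 100),
  Accuracy  = c(0.5806, 0.6648, 0.6973, 0.7172, 0.7175,
                0.7397, 0.7469, 0.7417, 0.7367, 0.7364,
                0.7619, 0.7590, 0.7864, 0.7688, 0.7864)
)

# Best accuracy; which.max() keeps the first (smallest) size in case of a tie
best_size <- rfe_profile$Variables[which.max(rfe_profile$Accuracy)]

# Illustrative alternative: smallest subset within 5% of the best accuracy
threshold <- max(rfe_profile$Accuracy) * 0.95
tolerant_size <- min(rfe_profile$Variables[rfe_profile$Accuracy >= threshold])

c(best = best_size, within_5_percent = tolerant_size)
```

Given that the accuracy standard deviations in the table are around 0.08, a 20-gene subset performs within noise of the 60-gene one and may be easier to interpret.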
Analysis of the annotations of these genes reveals a set of genes predominantly linked to functions related to the cell membrane and to epithelial tissues, which is consistent with the previous analyses.